EPrints Technical Mailing List Archive

Message: #07729



[EP-tech] Partitioning access table (INNODB)


Hello everybody,

Our EPrints repository has an access table with more than 133 million records, and it is still growing.

We recently updated the IRStats module to version 1.1 in our test repository, and the reindexing has been running for 9 days and counting.

We have also converted the access table to InnoDB and compressed it (the conversion took 7 hours).

 

The MySQL slow query log reports:

-   At first, each query took about 2 seconds (Query_time: 2.068357):

# Time: 190304  0:13:52

# User@Host: eprintsdbo[eprintsdbo] @  [10.147.128.44]  Id:    80

# Query_time: 2.068357  Lock_time: 0.000141 Rows_sent: 100000  Rows_examined: 700000

SET timestamp=1551654832;

SELECT `accessid`,`datestamp_year`,`datestamp_month`,`datestamp_day`,`datestamp_hour`,`datestamp_minute`,`datestamp_second`,`requester_id`,`requester_user_agent`,`referring_entity_id`,`service_type_id`,`referent_id`,`referent_docid` FROM `access` LIMIT 100000 OFFSET 600000;

 

-   Nine days later, each query took about 660 seconds (Query_time: 661.963604):

 

# Time: 190312 12:33:07

# User@Host: eprintsdbo[eprintsdbo] @  [10.147.128.44]  Id:  1077

# Query_time: 661.963604  Lock_time: 0.000180 Rows_sent: 99787  Rows_examined: 123899787

SET timestamp=1552390387;

SELECT `accessid`,`datestamp_year`,`datestamp_month`,`datestamp_day`,`datestamp_hour`,`datestamp_minute`,`datestamp_second`,`requester_id`,`requester_user_agent`,`referring_entity_id`,`service_type_id`,`referent_id`,`referent_docid` FROM `access` LIMIT 100000 OFFSET 123800000;
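
In both log entries, Rows_examined equals OFFSET plus Rows_sent (600000 + 100000 = 700000 and 123800000 + 99787 = 123899787), which suggests that each page re-reads every skipped row. A common alternative is keyset pagination: remember the last accessid returned and seek past it. The sketch below is only an illustration, not IRStats code. It assumes accessid is a unique, indexed, positive integer, uses an in-memory SQLite table as a stand-in for MySQL, trims the column list to two columns, and adds an ORDER BY that the logged queries do not have. The function names are hypothetical.

```python
import sqlite3


def pages_by_offset(conn, page_size):
    """Yield pages with LIMIT/OFFSET, the pattern seen in the slow query log."""
    offset = 0
    while True:
        rows = conn.execute(
            "SELECT accessid, referent_id FROM access "
            "ORDER BY accessid LIMIT ? OFFSET ?",
            (page_size, offset),
        ).fetchall()
        if not rows:
            return
        yield rows
        offset += page_size  # the server must skip this many rows next time


def pages_by_key(conn, page_size):
    """Yield pages by seeking past the last accessid seen (keyset pagination)."""
    last_id = 0  # assumption: accessid values are positive integers
    while True:
        rows = conn.execute(
            "SELECT accessid, referent_id FROM access "
            "WHERE accessid > ? ORDER BY accessid LIMIT ?",
            (last_id, page_size),
        ).fetchall()
        if not rows:
            return
        yield rows
        last_id = rows[-1][0]  # the index lookup starts here on the next page


def demo():
    """Build a small stand-in table and read it with both strategies."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE access (accessid INTEGER PRIMARY KEY, referent_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO access VALUES (?, ?)",
        [(i, i % 7) for i in range(1, 1001)],
    )
    by_offset = [row for page in pages_by_offset(conn, 100) for row in page]
    by_key = [row for page in pages_by_key(conn, 100) for row in page]
    conn.close()
    return by_offset, by_key
```

Both generators return the same rows in the same order. The difference is that the keyset version lets the database start each page from an index position instead of counting past all earlier rows.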

 

Has anyone partitioned the access table (for example, one partition per 10 million access records), and would this improve the generation of statistics?
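
For reference, range partitioning every 10 million records could be expressed roughly as below. This is a sketch under assumptions, not a tested migration. The helper name partition_ddl is hypothetical. accessid is assumed to be an integer column that is part of every unique key on the table, which MySQL requires of a partitioning column. The 133 million figure is the row count quoted above, used as a stand-in for the highest accessid. Note that MySQL prunes partitions only when a query filters on the partitioning column, and the logged queries have no WHERE clause, so partitioning alone may not speed them up.

```python
def partition_ddl(table, column, max_id, ids_per_partition):
    """Build a MySQL ALTER TABLE statement that range-partitions on `column`.

    One bounded partition is emitted per `ids_per_partition` ids up to
    `max_id`, followed by a catch-all partition for future rows.
    """
    # Ceiling division: number of bounded partitions needed to cover max_id.
    count = -(-max_id // ids_per_partition)
    parts = [
        f"  PARTITION p{i} VALUES LESS THAN ({(i + 1) * ids_per_partition})"
        for i in range(count)
    ]
    # Rows with larger ids land here until new partitions are added.
    parts.append("  PARTITION pmax VALUES LESS THAN MAXVALUE")
    body = ",\n".join(parts)
    return f"ALTER TABLE `{table}` PARTITION BY RANGE (`{column}`) (\n{body}\n);"


if __name__ == "__main__":
    # Assumed figures: about 133 million ids, 10 million ids per partition.
    print(partition_ddl("access", "accessid", 133_000_000, 10_000_000))
```

The generated statement would rebuild the whole table, so on a table of this size it is worth trying on a copy first.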

 

Regards,

JC

 

 

 

 

 


 

Juan Carlos Herraiz Regidor

Gobierno TI

Servicios Informáticos · Gobierno TI. Avenida Complutense s/n. 28040 Madrid

Telephone: +34 91 394 5130, Fax: +34 91 394 4773

www.ucm.es

___


 

This message is private and confidential and it is intended exclusively for the addressee. If you receive this message by mistake, you should not disseminate, distribute or copy this e-mail. Please inform the sender and delete the message and attachments from your system, as it is completely forbidden for you to use this information, according to the current legislation. No confidentiality nor any privilege regarding the information is waived or lost by any mistransmission or malfunction.

 

The personal data herein will be collected in the file "Correoweb", under the ownership of the Vice-Rectorate for Information Technologies, with which those concerned may exercise their rights of access, rectification, erasure, or objection (Articles 15-21 of Regulation (EU) 2016/679, General Data Protection Regulation).

 

Before printing this mail please consider whether it is really necessary: the environment is a concern for us all.