[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[EP-tech] Primary key on [dataset]__index tables



Hi Tomasz,
I'm glad I'm not the only one :o)

I think that the following SQL will test that you've got no duplicates in the table:
mysql> SELECT fieldword, pos, COUNT(*) c FROM eprint__index GROUP BY fieldword, pos HAVING c > 1;

If that results in an empty set, you might be in a good position to define the PK...
mysql> ALTER TABLE eprint__index ADD PRIMARY KEY(fieldword, pos);

**WARNING: I haven't done this yet** I'm still investigating!

I haven't started investigating the character set issue (anyone fancy taking a look at this?).

Cheers,
John


From: eprints-tech-bounces at ecs.soton.ac.uk [mailto:eprints-tech-bounces at ecs.soton.ac.uk] On Behalf Of Tomasz Neugebauer
Sent: 10 August 2016 20:17
To: eprints-tech at ecs.soton.ac.uk
Subject: Re: [EP-tech] Primary key on [dataset]__index tables

Hi John,

Thanks for sharing this!  We are also running 3.3.12 (upgraded from 3.2.4), and we have this:

*************************** 1. row ***************************
       Table: eprint__index
Create Table: CREATE TABLE `eprint__index` (
  `fieldword` varchar(255) DEFAULT NULL,
  `pos` int(11) DEFAULT NULL,
  `ids` text,
  KEY `eprint__index_pos` (`pos`),
  KEY `eprint__index_fieldword_pos` (`fieldword`,`pos`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
1 row in set (0.00 sec)


No primary key, and no UTF-8 on the CHARSET.   That is not so good - how do we fix this?
Could it also explain why our search is having difficulties with some French characters?

Best wishes,

Tomasz



From: eprints-tech-bounces at ecs.soton.ac.uk<mailto:eprints-tech-bounces at ecs.soton.ac.uk> [mailto:eprints-tech-bounces at ecs.soton.ac.uk] On Behalf Of John Salter
Sent: August-08-16 9:47 AM
To: 'eprints-tech at ecs.soton.ac.uk' <eprints-tech at ecs.soton.ac.uk<mailto:eprints-tech at ecs.soton.ac.uk>>
Subject: [EP-tech] Primary key on [dataset]__index tables

Hi,
I've been investigating some slowness issues with our database and discovered that some of the tables don't have primary keys - e.g. eprint__index.
On our live system running 3.3.10 (migrated from 3.1.1, and 2.[something] before then), this is the table definition:
mysql> SHOW CREATE TABLE eprint__index \G
*************************** 1. row ***************************
       Table: eprint__index
Create Table: CREATE TABLE `eprint__index` (
  `fieldword` varchar(255) DEFAULT NULL,
  `pos` int(11) DEFAULT NULL,
  `ids` text,
  KEY `pos` (`pos`),
  KEY `fieldword` (`fieldword`,`pos`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1

There are two keys, but no primary key.

On a dev install of EPrints 3.3.12, this is the table definition:
mysql> SHOW CREATE TABLE eprint__index \G
*************************** 1. row ***************************
       Table: eprint__index
Create Table: CREATE TABLE `eprint__index` (
  `fieldword` varchar(128) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
  `pos` int(11) NOT NULL,
  `ids` longtext CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,
  PRIMARY KEY (`fieldword`,`pos`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

A primary key exists.
There are differences in the version of MySQL, but it looks like there's a possibility that during an upgrade either the __index tables were missed out of some update, or excluded for some reason.
There is also the difference in character set - which should be caught by ~/bin/epadmin:upgrade_3_1_2_to_3_2_0

Inspecting the code, this block: https://github.com/eprints/eprints/blob/3.3/perl_lib/EPrints/Database.pm#L599-L605 will not add the primary key to existing tables.

Can anyone shed any light on this?
Do your __index tables have a primary key?

Cheers,
John

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ecs.soton.ac.uk/pipermail/eprints-tech/attachments/20160815/4f484165/attachment-0001.html