EPrints Technical Mailing List Archive

Message: #06377



[EP-tech] EPrints 3.3.11 failed OAI import to WorldCat Digital Collection Gateway due to setSpec being larger than 255 characters


I am attempting to upload the metadata from our EPrints 3.3.11 repository to the WorldCat Digital Collection Gateway using OAI-PMH.

 

The Digital Collection Gateway site returned the following error message on the page http://www.worldcat.org/DigitalCollectionGateway/collection_list.jsp

 

## Collection contained data too large for Digitial Collection Gateway.

## Collection constraints:

##

##     SetSpec cannot be larger than 255 characters

##    Name cannot be larger than 1000 characters

 

Examining my repository's OAI-PMH ListSets response for long setSpec values, I see entries whose setSpec is longer than 255 characters:

 

$ curl http://repository.cshl.edu/cgi/oai2?verb=ListSets > ListSets

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current

                                 Dload  Upload   Total   Spent    Left  Speed

100  311k    0  311k    0     0  93751      0 --:--:--  0:00:03 --:--:--  225k

$ grep -A 1 -B 0 -E '<setSpec>.{254,}</setSpec>' ListSets

      <setSpec>7375626A656374733D496E76657374696761746976655F546563686E69717565735F616E645F45717569706D656E74:4D6963726F73636F7069635F546563686E69717565735F6F725F65717569706D656E74:6C6173657273:7175616E74697461746976655F6C617365725F7363616E6E696E675F70686F746F7374696D756C6174696F6E</setSpec>

      <setName>Subject = Investigative techniques and equipment: optical devices: lasers: quantitative laser scanning photostimulation</setName>

--

      <setSpec>7375626A656374733D42696F696E666F726D6174696373:47656E6F6D696373:47656E6574696373:444E415F7374727563747572655F616E645F6D6F64696669636174696F6E:67656E65735F7374727563747572655F66756E6374696F6E:67656E65735F74797065:74726974686F7261785F67726F75705F67656E6573</setSpec>

      <setName>Subject = bioinformatics: genomics and proteomics: genetics &amp; nucleic acid processing: DNA, RNA structure, function, modification: genes, structure and function: genes: types: trithorax group genes</setName>

--

      <setSpec>7375626A656374733D42696F696E666F726D6174696373:47656E6F6D696373:47656E6574696373:444E415F7374727563747572655F616E645F6D6F64696669636174696F6E:67656E65735F7374727563747572655F66756E6374696F6E:67656E655F726567756C6174696F6E:68657465726F64696D6572697A6174696F6E</setSpec>

      <setName>Subject = bioinformatics: genomics and proteomics: genetics &amp; nucleic acid processing: DNA, RNA structure, function, modification: genes, structure and function: gene regulation: heterodimerization</setName>

--

      <setSpec>7375626A656374733D42696F696E666F726D6174696373:47656E6F6D696373:47656E6574696373:444E415F7374727563747572655F616E645F6D6F64696669636174696F6E:67656E65735F7374727563747572655F66756E6374696F6E:67656E65735F74797065:6469737275707465645F696E5F736368697A6F706872656E69615F31</setSpec>

      <setName>Subject = bioinformatics: genomics and proteomics: genetics &amp; nucleic acid processing: DNA, RNA structure, function, modification: genes, structure and function: genes: types: disrupted-in-schizophrenia 1</setName>

--

      <setSpec>7375626A656374733D42696F696E666F726D6174696373:47656E6F6D696373:47656E6574696373:444E415F7374727563747572655F616E645F6D6F64696669636174696F6E:67656E65735F7374727563747572655F66756E6374696F6E:67656E65735F74797065:696D6D6564696174655F6561726C795F67656E6573</setSpec>

      <setName>Subject = bioinformatics: genomics and proteomics: genetics &amp; nucleic acid processing: DNA, RNA structure, function, modification: genes, structure and function: genes: types: immediate early genes</setName>

--

      <setSpec>7375626A656374733D42696F696E666F726D6174696373:47656E6F6D696373:47656E6574696373:444E415F7374727563747572655F616E645F6D6F64696669636174696F6E:67656E65735F7374727563747572655F66756E6374696F6E:67656E655F726567756C6174696F6E:68657465726F64696D6572697A6174696F6E:68657465726F64696D6572</setSpec>

      <setName>Subject = bioinformatics: genomics and proteomics: genetics &amp; nucleic acid processing: DNA, RNA structure, function, modification: genes, structure and function: gene regulation: heterodimerization: heterodimer</setName>
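
The grep pattern above matches any setSpec of 254 characters or more, so it can also report values that are still within the gateway's limit. A stricter check could look something like the sketch below. It assumes the ListSets file saved by the curl command above, with each setSpec element on a single line; long_setspecs is a hypothetical helper name.

```shell
# Print the length and value of every setSpec longer than a limit
# (default 255). Reads an OAI-PMH ListSets response on stdin.
# Assumption: each <setSpec>...</setSpec> element sits on one line.
# long_setspecs is a hypothetical helper name.
long_setspecs() {
    limit="${1:-255}"
    # Keep only the text between the setSpec tags, one value per line.
    sed -n 's/.*<setSpec>\([^<]*\)<\/setSpec>.*/\1/p' |
        # Report values whose length exceeds the limit.
        awk -v limit="$limit" 'length($0) > limit { print length($0), $0 }'
}

# Usage, assuming the ListSets file from the curl command above:
#   long_setspecs 255 < ListSets
```

Each output line starts with the character count, so piping the result through sort -n would show the worst offenders last.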

 

It appears that the verbosity of the MeSH headings we are using for subjects produces setSpec values that exceed the WorldCat Digital Collection Gateway's 255-character limit.
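
The setSpec values above look like hex-encoded ASCII, with ":" separating the levels of the subject tree: the leading 7375626A656374733D decodes to "subjects=". If that reading is right, the hex encoding roughly doubles the length of every subject identifier. A minimal sketch for decoding a value to check this, assuming every ":"-separated segment is an even-length run of hex digits; decode_setspec is a hypothetical helper name, not part of EPrints.

```shell
# Decode a hex-encoded setSpec back to readable text.
# Assumption: each ':'-separated segment is an even-length string of hex
# digits encoding ASCII. decode_setspec is a hypothetical helper name.
decode_setspec() {
    # Split the input into one token per line: hex byte pairs and ':'.
    printf '%s\n' "$1" |
        sed -e 's/:/ : /g' -e 's/[0-9A-Fa-f][0-9A-Fa-f]/& /g' |
        tr -s ' ' '\n' |
        while read -r tok; do
            if [ "$tok" = ":" ]; then
                printf ':'
            elif [ -n "$tok" ]; then
                # Convert the hex byte to octal, then emit that byte.
                printf "\\$(printf '%03o' "0x$tok")"
            fi
        done
    printf '\n'
}

decode_setspec '7375626A656374733D'   # prints: subjects=
```

Comparing the length of a decoded value with its encoded form shows how much of the 255-character budget the encoding itself consumes.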

 

Can anyone suggest a workaround?

Has anyone written setSpec encoding or decoding code that keeps the values shorter?

 

Thanks

 

Tom