EPrints Technical Mailing List Archive

Message: #06522


Re: [EP-tech] Bulk updating questions

Hi John,


Many thanks for getting back to me about this – sorry it’s taken so long for me to pick this back up.


It’s a shame there doesn’t seem to be a lot of info about the REST service around – this is certainly how I’d prefer to do things and it’ll allow me to do it in something other than Perl which is by no means a strong area for me these days!  But you’ve given me some great advice here!  I will certainly look into updating the staff search by adding extra fields.  I’m not sure if creator_id would exactly work, because we may have members of a team adding items for others but I want to be able to find any items where a particular person is listed as an author (so not necessarily the creator).  But it’s a start!


The subject to be added would be in a separate table. We have an eprints_research_units table that holds the eprintid, pos, and research_units fields. So what I’m trying to do is add a new row in that table for every eprintid where that eprint item has an author id that is in a set derived from various users.userid values.
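As a rough illustration of the row-adding step described above, here is a minimal Python sketch that works out which (eprintid, pos, research_units) rows would be needed. It uses a plain dictionary as a stand-in for the table and does not touch a database or the EPrints API. The function names, the 0-based pos numbering and the example unit name are assumptions made for illustration, not taken from EPrints.

```python
def plan_unit_rows(existing_units, eprintids, unit):
    """Work out which (eprintid, pos, unit) rows are missing.

    existing_units: dict mapping eprintid -> list of units already set,
                    in pos order (a stand-in for the real table).
    eprintids:      ids of the items the chosen people are authors of.
    unit:           the research unit to add (hypothetical value).
    """
    rows = []
    for eprintid in sorted(set(eprintids)):
        current = existing_units.get(eprintid, [])
        if unit in current:
            continue  # already tagged, so nothing to add
        # Assumption: pos is a 0-based index, so the next free pos is the count.
        rows.append((eprintid, len(current), unit))
    return rows


def apply_rows(existing_units, rows, dry_run=True):
    """Report each planned row; only change the data when dry_run is False."""
    for eprintid, pos, unit in rows:
        verb = "would add" if dry_run else "adding"
        print(f"{verb}: eprintid={eprintid} pos={pos} research_units={unit}")
        if not dry_run:
            existing_units.setdefault(eprintid, []).append(unit)
    return len(rows)
```

Running apply_rows with the default dry_run=True only prints what would change, which makes it easy to review the list before committing to anything. How the rows are actually written (SQL, REST or an EPrints script) is left open here.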


So I’ll investigate the staff search now but if I get stuck I may very well take you up on your offer of an example script. :)


Many thanks,







From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of John Salter
Sent: 17 May 2017 12:00
To: eprints-tech@ecs.soton.ac.uk
Subject: Re: [EP-tech] Bulk updating questions


Hi Andy,

I used to do things like this in SQL, but now try to do things the EPrints way wherever possible.

I can't provide much info on the REST interface, but the following might be useful:


- You can add additional fields to the Staff search: https://wiki.eprints.org/w/Search.pl#Adding_fields_to_the_staff_search - e.g. you could add creator_id


- You could write a script to search for items, and add subjects to them automatically. There is real power here - if you can get the search working for you (I always include a 'dry run' option to see what would get changed before actually making changes).

Is the subject-to-be-added a map (userid => subjectid), or is it an attribute on the user object ($user->value( "department" ))?

I can provide an example that might get you part way to where you need to be.


The benefits of doing it the EPrints way are:

- your items will have a revision (and a 'history' entry showing what made the change)

- the summary page will be regenerated

- they'll appear as updated records in the OAI-PMH interface

- any indexing that should happen will

(possibly more!)


Hope that helps a bit - if you want an example script let me know and I'll see what I can do!






From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Andrew Collington
Sent: 17 May 2017 11:15
To: eprints-tech@ecs.soton.ac.uk
Subject: [EP-tech] Bulk updating questions


Hi all,


We need to do some bulk updating but are worried that the built-in search in the web interface doesn’t seem to allow us to get too specific and may cause conflicts on names (if they are the same/similar but different people). I’ve seen mention of a REST API but cannot find any documentation on setting it up or using it, so am considering just updating directly via SQL instead.


The part we want to bulk update is which research unit the item is associated with (essentially adding a subject to the item). We want to do this for any items on which specific users are authors – not necessarily ones that they’ve added to the system themselves. However, I’m having trouble finding the right information in the database. I have the list of users.userid and can obviously link that to the eprint.userid, but that only covers items they’ve added themselves. What I can’t find is any kind of linking table for author ids to eprint ids. At first I thought it might be the eprint_contributors_id table but am not so sure, and eprint_creators_id only seems to have email addresses for the creators_id (or null)… I’m sure this is all possible and I’m just missing something really obvious!
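One possible way to make the link, sketched in Python against an in-memory SQLite database: if creators_id holds email addresses, it can be matched against the email stored on each user record. The table layouts below are simplified stand-ins modelled on the names mentioned in this message. The email column on users, the exact table names and the case-insensitive match are all assumptions to check against the real schema.

```python
import sqlite3


def eprints_for_users(conn, userids):
    """Return sorted eprint ids on which any of the given users is a creator.

    Assumes creators_id holds the same email address as the user record.
    """
    userids = list(userids)
    if not userids:
        return []
    placeholders = ",".join("?" for _ in userids)
    sql = (
        "SELECT DISTINCT c.eprintid "
        "FROM eprint_creators_id AS c "
        "JOIN users AS u ON lower(c.creators_id) = lower(u.email) "
        f"WHERE u.userid IN ({placeholders}) "
        "ORDER BY c.eprintid"
    )
    return [row[0] for row in conn.execute(sql, userids)]


def demo_connection():
    """Build a throwaway database with made-up rows for trying the query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (userid INTEGER PRIMARY KEY, email TEXT);
        CREATE TABLE eprint_creators_id (
            eprintid INTEGER, pos INTEGER, creators_id TEXT
        );
        INSERT INTO users VALUES
            (1, 'a.author@example.org'),
            (2, 'b.author@example.org');
        INSERT INTO eprint_creators_id VALUES
            (100, 0, 'a.author@example.org'),
            (100, 1, 'b.author@example.org'),
            (101, 0, 'B.Author@example.org'),
            (102, 0, NULL);
    """)
    return conn
```

Rows with a null creators_id never match, so any item whose authors have no id recorded would be missed and would need handling some other way (by name, for example).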


So given the above I have (naturally!) a number of questions:


1)      Is it possible to update the search so instead of author name you could put in a user id and get all of the items for which they’re an author?

2)      Is there any documentation on setting up and using the REST API?

3)      Can anyone let me know the tables/fields where I can link author id to user id to eprint id?


#3 seems like it should be really easy (if I could find the right linking tables/fields), and #1 requires the core code to be updated, which I’d prefer not to do if I can help it. #2 seems like it should be the “right” way to go about it, assuming there’s documentation and I can query it sufficiently and then make updates via the REST API.


Many thanks for any advice!







Andrew Collington
Web Developer, ITS Applications
Shawcross, University of Sussex, Falmer, Brighton, BN1 9QT

T: (01273) 872591 (ext. 2591)
E: a.p.collington@sussex.ac.uk