EPrints Technical Mailing List Archive

Message: #02794

[EP-tech] Re: Updating Subjects


Something like:

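use strict;
use warnings;

# assumes a default install location; adjust to your own
use lib '/opt/eprints3/perl_lib';
use EPrints;

# repository id is passed as the first argument
my $repoid = shift @ARGV or die "Usage: $0 <repoid>\n";
my $repo = EPrints->new->repository( $repoid ) or die "Unknown repository: $repoid\n";

# mapping old id => new id
my %SUBJECTSMAP = ( "oldid1" => "newid1", "oldid2" => "newid2" );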

# select eprints which have a subject set
my $list = $repo->dataset( 'eprint' )->search( filters => [ {meta_fields => [qw/ subjects /], match => 'SET' } ] );

# remap the subjects on each matching eprint
$list->map( sub {

        my $eprint = $_[2] or return;

        my @subjects = @{ $eprint->value( 'subjects' ) || [] };
        my $done_any = 0;

        my @new_subjects;
        foreach my $subject (@subjects)
        {
                if( exists $SUBJECTSMAP{$subject} )
                {
                        push @new_subjects, $SUBJECTSMAP{$subject};
                        $done_any++;
                }
                else
                {
                        push @new_subjects, $subject;
                }
        }

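        return if !$done_any;

        # merging several old ids into one new id can leave duplicates; keep the first of each
        my %seen;
        @new_subjects = grep { !$seen{$_}++ } @new_subjects;

        $eprint->set_value( 'subjects', \@new_subjects );
        $eprint->commit;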

} );


Note 1: untested!
Note 2: 3.3 API
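
The remapping step itself can be tried on plain arrays before anything touches the repository. A minimal sketch, assuming nothing from EPrints: remap_subjects is a hypothetical helper (not part of the API) and the ids are made up. It also drops the repeated id you would otherwise get when two old subjects are merged into the same new one.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of EPrints): remap a list of subject ids.
# Returns a reference to the new list and the number of ids replaced.
sub remap_subjects
{
        my( $map, $subjects ) = @_;

        my( @new_subjects, %seen );
        my $changed = 0;

        foreach my $subject ( @{ $subjects || [] } )
        {
                my $target = $subject;
                if( exists $map->{$subject} )
                {
                        $target = $map->{$subject};
                        $changed++;
                }
                # two old ids merged into one new id would otherwise repeat it
                push @new_subjects, $target unless $seen{$target}++;
        }

        return( \@new_subjects, $changed );
}

# made-up ids, for illustration only
my %map = ( "oldid1" => "newid1", "oldid2" => "newid1" );
my( $new, $changed ) = remap_subjects( \%map, [qw/ oldid1 other oldid2 /] );
print join( ",", @$new ), " ($changed replaced)\n";   # newid1,other (2 replaced)
```

Inside the map() callback you would only set_value and commit when the replaced count is non-zero, as in the script above.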


Perhaps we should keep some basic scripts on github? What do you reckon?

Seb


On 18/03/14 04:00, Eliseo Gatchalian wrote:
Thanks John and Sebastian for your usual utmost response.

I'm also aware (and scared, lol!) that I might break something if updates are done directly via SQL, so I'm looking around for alternative solutions that others might have already tried.

I agree with you guys that a tool for doing this would be great!  Did you say John that you are going to write one for this? :) :) :)

Thanks again guys!


------------------------------

Message: 2
Date: Mon, 17 Mar 2014 12:26:38 +0000
From: John Salter <J.Salter@leeds.ac.uk>
Subject: [EP-tech] Re: Updating Subjects
To: "eprints-tech@ecs.soton.ac.uk" <eprints-tech@ecs.soton.ac.uk>
Message-ID:
	<7154BCBB8909D642AE6F44CA713DBC200828ADDA4859@HERMES7.ds.leeds.ac.uk>
Content-Type: text/plain; charset="utf-8"

Eliseo,
I'd support Seb's response - I always thought my way was 'hacky'.
Maybe I need to write a tool to do this?

Cheers,
John


From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Sebastien Francois
Sent: 17 March 2014 12:05
To: eprints-tech@ecs.soton.ac.uk
Subject: [EP-tech] Re: Updating Subjects

Hola,

If I may add to this.... You shouldn't use direct SQL unless there are no other choices - this potentially breaks the relational model.

So instead, you should write a script that transfers objects from one subject id to another, using the EPrints API.

Seb.


On 17/03/14 09:53, John Salter wrote:
Hi Eliseo,
There are two sides to this:
- getting the new subjects created
- moving eprints from the old subjects to the new ones.

If possible, I'd look at getting the new subjects into order first - as you've already identified, either via the web interface or the import (you might want to look at the subject export too, which might be useful: export, add your new subjects and then re-import).
Creating a new set of nodes is (IMO) easier than trying to rework the existing ones, as they are stored in the database as both child and parent trees, so you'd possibly have to make changes in all of these tables:
subject
subject__index
subject__index_grep
subject__ordervalues_en
subject__rindex
subject_ancestors
subject_name_lang
subject_name_name
subject_parents

To move the eprints from one node to another, if it's a straight swap (e.g. subjectid_A is replaced by subjectid_1; subjectid_B is replaced by subjectid_2 etc.), it's fairly easy to do in the database (you could even clone the eprint_subjects table and do the updates in the new copy to be on the safe side).

mysql> UPDATE eprint_subjects SET subject = 'newSubjectID' WHERE subject = 'oldSubjectID';

I find the following query useful to see whether any subjects are in use but not defined:
mysql> SELECT count(*), subject, pos FROM eprint_subjects WHERE subject NOT IN (SELECT subjectid FROM subject) GROUP BY subject, pos;
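
A similar check is worth making before a remapping script is run: every new id in the mapping should already be a defined subject. A minimal Perl sketch on plain data follows; undefined_targets is a hypothetical helper and the ids are made up. In practice the list of defined ids would come from the subject table or a subject export.

```perl
use strict;
use warnings;

# Hypothetical helper: list the mapping targets that are not defined subjects.
sub undefined_targets
{
        my( $map, $defined_ids ) = @_;

        my %defined = map { $_ => 1 } @{$defined_ids};
        my %missing;
        foreach my $new_id ( values %{$map} )
        {
                $missing{$new_id} = 1 unless $defined{$new_id};
        }

        my @sorted = sort keys %missing;
        return @sorted;
}

# made-up ids, for illustration only
my %map = ( "oldid1" => "newid1", "oldid2" => "newid2" );
my @missing = undefined_targets( \%map, [qw/ newid1 subjects /] );
print "not defined: @missing\n" if @missing;   # not defined: newid2
```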

Hope that helps a bit - someone else might have some better/easier ways of doing this!
Cheers,
John


From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Eliseo Gatchalian
Sent: 17 March 2014 03:06
To: eprints-tech@ecs.soton.ac.uk
Subject: [EP-tech] Updating Subjects

Hi,

I know that we can add and edit subjects by logging in as admin, and that we can also import a subject XML file. But if we are to merge some subjects with different subject codes into one, is there an automatic way to update the subject codes for the items already submitted in the archive?

I saw the eprint_subjects table, in which we could replace the subject field with the new codes to update the items, but I'm not sure if there is anything else that we need to update?

Thanks!



Ellis Gatchalian
Systems Librarian
Wintec
Private Bag 3036, Waikato Mail Centre, Hamilton 3240
Phone: +64-(0)7-834 8800 ext 8633
Fax: +64-(0)7-838 8257
Email: ellis.gatchalian@wintec.ac.nz
Web: http://www.wintec.ac.nz/


*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech

*** Archive: http://www.eprints.org/tech.php/

*** EPrints community wiki: http://wiki.eprints.org/

*** EPrints developers Forum: http://forum.eprints.org/

------------------------------

_______________________________________________
Eprints-tech mailing list
Eprints-tech@ecs.soton.ac.uk
http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech


End of Eprints-tech Digest, Vol 66, Issue 34
********************************************
