EPrints Technical Mailing List Archive

Message: #02797



[EP-tech] Re: Updating Subjects


Thanks Sebastien!  Yes, it would be good to have a tool that we can download from GitHub; I am probably not the only one whose subjects change frequently.

Best regards,
Ellis

----------------------------------------------------------------------

Message: 1
Date: Tue, 18 Mar 2014 10:34:50 +0000
From: Sebastien Francois <sf2@ecs.soton.ac.uk>
Subject: [EP-tech] Re: Updating Subjects
To: eprints-tech@ecs.soton.ac.uk
Message-ID: <5328214A.8080605@ecs.soton.ac.uk>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Something like:

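use strict;
use warnings;
use EPrints;

# assumption: the repository id is passed as the first argument and the
# EPrints perl_lib directory is on @INC (e.g. via perl -I)
my $repo = EPrints->new->repository( $ARGV[0] )
        or die "Could not load repository\n";

# mapping old id => new id
my %SUBJECTSMAP = ( "oldid1" => "newid1", "oldid2" => "newid2" );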

# select eprints which have a subject set
my $list = $repo->dataset( 'eprint' )->search( filters => [ {meta_fields => [qw/ subjects /], match => 'SET' } ] );

# remap the subjects of each matching eprint
$list->map( sub {

         my $eprint = $_[2] or return;

         my @subjects = @{ $eprint->value( 'subjects' ) || [] };
         my $done_any = 0;

         my @new_subjects;
         foreach my $subject (@subjects)
         {
                 if( exists $SUBJECTSMAP{$subject} )
                 {
                         push @new_subjects, $SUBJECTSMAP{$subject};
                         $done_any++;
                 }
                 else
                 {
                         push @new_subjects, $subject;
                 }
         }

         return if !$done_any;

         $eprint->set_value( 'subjects', \@new_subjects );
         $eprint->commit;

} );


Note 1: untested!
Note 2: 3.3 API
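
One case the loop above does not cover: when several old ids are merged into a single new id, an eprint that carried two of the old ids ends up with the new id twice. A minimal sketch of the remapping step with de-duplication follows. It is plain Perl with no EPrints calls; the helper name remap_subjects and the example ids are made up for illustration and are not part of the EPrints API.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of EPrints): remap a list of subject ids
# and drop duplicates that appear when several old ids share one new id.
# Returns the new list (array ref) and a flag that is 1 when the new list
# differs from the input list and 0 otherwise.
sub remap_subjects
{
        my( $map, @subjects ) = @_;

        my( @new_subjects, %seen );
        foreach my $subject (@subjects)
        {
                my $new = exists $map->{$subject} ? $map->{$subject} : $subject;
                next if $seen{$new}++;    # keep the first occurrence only
                push @new_subjects, $new;
        }

        my $changed = join( "\0", @new_subjects ) ne join( "\0", @subjects ) ? 1 : 0;
        return( \@new_subjects, $changed );
}

# Example ids are made up: two old ids are merged into one new id.
my %map = ( "oldid1" => "newid1", "oldid2" => "newid1" );
my( $new, $changed ) = remap_subjects( \%map, "oldid1", "other", "oldid2" );
print join( ",", @$new ), " changed=$changed\n";
```

Inside the map callback, the foreach loop could then be replaced by a call to this helper with \%SUBJECTSMAP and @subjects, returning early when the flag is 0 and passing the returned array ref to set_value otherwise. Like the script itself, this has not been run against a live repository.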


Perhaps we should keep some basic scripts on github? What do you reckon?

Seb


On 18/03/14 04:00, Eliseo Gatchalian wrote:
> Thanks, John and Sebastien, for your helpful responses, as usual.
>
> I'm also aware (and scared! lol!) that I might break something if the updates are done directly via SQL, so I'm looking around for alternative solutions that others might have already tried.
>
> I agree with you guys that a tool for doing this would be great!  Did
> you say, John, that you are going to write one for this? :) :) :)
>
> Thanks again guys!
>
>
> ------------------------------
>
> Message: 2
> Date: Mon, 17 Mar 2014 12:26:38 +0000
> From: John Salter <J.Salter@leeds.ac.uk>
> Subject: [EP-tech] Re: Updating Subjects
> To: "eprints-tech@ecs.soton.ac.uk" <eprints-tech@ecs.soton.ac.uk>
> Message-ID:
> 	<7154BCBB8909D642AE6F44CA713DBC200828ADDA4859@HERMES7.ds.leeds.ac.uk>
> Content-Type: text/plain; charset="utf-8"
>
> Eliseo,
> I'd support Seb's response - I always thought my way was 'hacky'.
> Maybe I need to write a tool to do this?
>
> Cheers,
> John
>
>
> From: eprints-tech-bounces@ecs.soton.ac.uk 
> [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Sebastien 
> Francois
> Sent: 17 March 2014 12:05
> To: eprints-tech@ecs.soton.ac.uk
> Subject: [EP-tech] Re: Updating Subjects
>
> Hola,
>
> If I may add to this: you shouldn't use direct SQL unless there is no other choice, because it can break the relational model.
>
> Instead, you should write a script that transfers objects from one subject id to another, using the EPrints API.
>
> Seb.
>
>
> On 17/03/14 09:53, John Salter wrote:
> Hi Eliseo,
> There are two sides to this:
> - getting the new subjects created
> - moving eprints from the old subjects to the new ones.
>
> If possible, I'd get the new subjects in order first, either via the web interface or the import, as you've already identified. You might also want to look at the subject export: export the subjects, add your new ones, and then re-import.
> Creating a new set of nodes is (IMO) easier than trying to rework the existing ones, because subjects are stored in the database as both child and parent trees, so you'd possibly have to make changes in all of these tables:
> subject
> subject__index
> subject__index_grep
> subject__ordervalues_en
> subject__rindex
> subject_ancestors
> subject_name_lang
> subject_name_name
> subject_parents
>
> To move the eprints from one node to another, if it's a straight swap (e.g. subjectid_A is replaced by subjectid_1, subjectid_B is replaced by subjectid_2, etc.), it's fairly easy to do in the database (you could even clone the eprint_subjects table and do the updates in the new copy to be on the safe side).
>
> mysql> UPDATE eprint_subjects SET subject = newSubjectID WHERE subject = oldSubjectID;
>
> I find the following query useful for checking whether any subjects are in use but not defined:
> mysql> select count(*), subject, pos from eprint_subjects where subject NOT IN (select subjectid from subject) group by subject, pos;
>
> Hope that helps a bit - someone else might have some better/easier ways of doing this!
> Cheers,
> John
>
>
> From: eprints-tech-bounces@ecs.soton.ac.uk
> [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of
> Eliseo Gatchalian
> Sent: 17 March 2014 03:06
> To: eprints-tech@ecs.soton.ac.uk
> Subject: [EP-tech] Updating Subjects
>
> Hi,
>
> I know that we can add and edit the subjects by logging in as admin, and that we can also import a subject_xml file. But if we are to merge some subjects with different subject codes into one, is there an automatic way to update the subject codes for the items already submitted to the archive?
>
> I saw the eprint_subjects table, where we can replace the subject field with the new codes and so update the items, but I'm not sure whether there is anything else that we need to update.
>
> Thanks!
>
>
>
> Ellis Gatchalian
> Systems Librarian
> Wintec
> Private Bag 3036, Waikato Mail Centre, Hamilton 3240
> Phone: +64-(0)7-834 8800 ext 8633
> Fax: +64-(0)7-838 8257
> Email: ellis.gatchalian@wintec.ac.nz
> Web: http://www.wintec.ac.nz/
>
> *** Options: 
> http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
>
> *** Archive: http://www.eprints.org/tech.php/
>
> *** EPrints community wiki: http://wiki.eprints.org/
>
> *** EPrints developers Forum: http://forum.eprints.org/
>
> ------------------------------
>
> _______________________________________________
> Eprints-tech mailing list
> Eprints-tech@ecs.soton.ac.uk
> http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
>
>
> End of Eprints-tech Digest, Vol 66, Issue 34
> ********************************************
>
