[EP-tech] Re: Updating Subjects



Something like:

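use strict;
use warnings;
use EPrints;    # add a "use lib" line for your perl_lib directory if needed

# 'REPOID' is a placeholder: use your own repository id
my $repo = EPrints->new->repository( 'REPOID' ) or die "unknown repository\n";

# mapping old id => new id
my %SUBJECTSMAP = ( "oldid1" => "newid1", "oldid2" => "newid2" );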

# select eprints which have a subject set
my $list = $repo->dataset( 'eprint' )->search(
         filters => [ { meta_fields => [qw/ subjects /], match => 'SET' } ],
);

# apply the mapping to every eprint in the list
$list->map( sub {

         my $eprint = $_[2] or return;

         my @subjects = @{ $eprint->value( 'subjects' ) || [] };
         my $done_any = 0;

         my @new_subjects;
         foreach my $subject (@subjects)
         {
                 if( exists $SUBJECTSMAP{$subject} )
                 {
                         push @new_subjects, $SUBJECTSMAP{$subject};
                         $done_any++;
                 }
                 else
                 {
                         push @new_subjects, $subject;
                 }
         }

         return if !$done_any;

         $eprint->set_value( 'subjects', \@new_subjects );
         $eprint->commit;

} );


Note 1: untested!
Note 2: 3.3 API
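
One thing the script above does not handle: if several old ids are merged into the same new id, an eprint that had two of them ends up with the same subject twice. A minimal sketch of a helper that remaps and de-duplicates in one pass is below. remap_subjects is a hypothetical name (it is not part of the EPrints API), and it is plain Perl so it can be tried outside EPrints:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the EPrints API.
# Remaps subject ids through $map and drops duplicates, keeping the
# first occurrence of each id. Returns ( \@new_subjects, $changed ).
sub remap_subjects
{
        my( $map, @subjects ) = @_;

        my( @new_subjects, %seen );
        foreach my $subject (@subjects)
        {
                my $new = exists $map->{$subject} ? $map->{$subject} : $subject;
                next if $seen{$new}++;    # already in the list: skip it
                push @new_subjects, $new;
        }

        # changed if any id was rewritten or any duplicate was dropped
        my $changed = join( "\0", @new_subjects ) ne join( "\0", @subjects );

        return( \@new_subjects, $changed );
}

# example: two old ids merged into one new id
my %merge_map = ( "oldid1" => "newid", "oldid2" => "newid" );
my( $merged, $changed ) = remap_subjects( \%merge_map, qw/ oldid1 other oldid2 / );
# $merged is [ "newid", "other" ] and $changed is true
```

Inside the map callback this would replace the foreach loop: call remap_subjects( \%SUBJECTSMAP, @subjects ), return if $changed is false, otherwise pass the returned array ref to set_value and commit as before.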


Perhaps we should keep some basic scripts on github? What do you reckon?

Seb


On 18/03/14 04:00, Eliseo Gatchalian wrote:
> Thanks John and Sebastian for your usual utmost response.
>
> I'm also aware (also scared! lol! ) that I might break something if updates are done directly via SQL, thus looking around for some alternative solutions that others might have already tried.
>
> I agree with you guys that a tool for doing this would be great!  Did you say John that you are going to write one for this? :) :) :)
>
> Thanks again guys!
>
>
> ------------------------------
>
> Message: 2
> Date: Mon, 17 Mar 2014 12:26:38 +0000
> From: John Salter <J.Salter at leeds.ac.uk>
> Subject: [EP-tech] Re: Updating Subjects
> To: "eprints-tech at ecs.soton.ac.uk" <eprints-tech at ecs.soton.ac.uk>
> Message-ID:
> 	<7154BCBB8909D642AE6F44CA713DBC200828ADDA4859 at HERMES7.ds.leeds.ac.uk>
> Content-Type: text/plain; charset="utf-8"
>
> Eliseo,
> I'd support Seb's response - I always thought my way was 'hacky'.
> Maybe I need to write a tool to do this?
>
> Cheers,
> John
>
>
> From: eprints-tech-bounces at ecs.soton.ac.uk [mailto:eprints-tech-bounces at ecs.soton.ac.uk] On Behalf Of Sebastien Francois
> Sent: 17 March 2014 12:05
> To: eprints-tech at ecs.soton.ac.uk
> Subject: [EP-tech] Re: Updating Subjects
>
> Hola,
>
> If I may add to this.... You shouldn't use direct SQL unless there are no other choices - this potentially breaks the relational model.
>
> So instead, you should write a script that transfers objects from one subject id to another, using the EPrints API.
>
> Seb.
>
>
> On 17/03/14 09:53, John Salter wrote:
> Hi Eliseo,
> There are two sides to this:
> - getting the new subjects created
> - moving eprints from the old subjects to the new ones.
>
> If possible, I'd look at getting the new subjects into order first - as you've already identified, either via the web interface or the import (you might want to look at the subject export too - which might be useful to do, add your new subjects and then re-import).
> Creating a new set of nodes is (IMO) easier than trying to rework the existing ones as they are stored in the database as both child and parent trees, so you'd possibly have to make changes to stuff in all these tables:
> subject
> subject__index
> subject__index_grep
> subject__ordervalues_en
> subject__rindex
> subject_ancestors
> subject_name_lang
> subject_name_name
> subject_parents
>
> To move the eprints from one node to another, if it's a straight swap (e.g. subjectid_A is replaced by subjectid_1; subjectid_B is replaced by subjectid_2 etc.), it's fairly easy to do in the database (you could even clone the eprint_subjects table and do the updates in the new copy to be on the safe side).
>
> mysql> UPDATE eprint_subjects SET subject = newSubjectID WHERE subject = oldSubjectID;
>
> I find the following query useful to see if there are any subjects in use, but not defined:
> mysql> SELECT count(*), subject, pos FROM eprint_subjects WHERE subject NOT IN (SELECT subjectid FROM subject) GROUP BY subject, pos;
>
> Hope that helps a bit - someone else might have some better/easier ways of doing this!
> Cheers,
> John
>
>
> From: eprints-tech-bounces at ecs.soton.ac.uk<mailto:eprints-tech-bounces at ecs.soton.ac.uk> [mailto:eprints-tech-bounces at ecs.soton.ac.uk] On Behalf Of Eliseo Gatchalian
> Sent: 17 March 2014 03:06
> To: eprints-tech at ecs.soton.ac.uk<mailto:eprints-tech at ecs.soton.ac.uk>
> Subject: [EP-tech] Updating Subjects
>
> Hi,
>
> I know that we can add and edit the subjects by logging in as admin, and can also import a subject_xml file, but if we are to merge some subjects with different subject codes into one, is there an automatic way to update the subject codes for the items already submitted in the archive?
>
> I saw the eprint_subjects table, wherein we can replace the subject field with the new codes and update the items, but I'm not sure if there is anything else that we need to update?
>
> Thanks!
>
>
>
> Ellis Gatchalian
> Systems Librarian
> Wintec
> Private Bag 3036, Waikato Mail Centre, Hamilton 3240
> Phone: +64-(0)7-834 8800 ext 8633
> Fax: +64-(0)7-838 8257
> Email: ellis.gatchalian at wintec.ac.nz<mailto:ellis.gatchalian at wintec.ac.nz>
> Web: http://www.wintec.ac.nz/
>
> ________________________________
>
> This electronic mail transmission is intended for the named recipients only. It may contain private and confidential information. If this has come to you in error you must take no action based upon it, nor must you copy it or show it to anyone; please telephone or email the sender at Wintec immediately and return the original email. We cannot accept any liability for any loss or damage sustained as a result of software viruses. It is your responsibility to carry out such virus checking as is necessary before opening any attachment which may be included with this message.
>
>
>
>
> *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
>
> *** Archive: http://www.eprints.org/tech.php/
>
> *** EPrints community wiki: http://wiki.eprints.org/
>
> *** EPrints developers Forum: http://forum.eprints.org/
>
> ------------------------------
>
> _______________________________________________
> Eprints-tech mailing list
> Eprints-tech at ecs.soton.ac.uk
> http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
>
>
> End of Eprints-tech Digest, Vol 66, Issue 34
> ********************************************
>