
[EP-tech] Re: harvester (question)



Hi Jean-Marie,
I think it's probably a namespace problem (see references below). If you try
$xml->findnodes( '//tef:auteur/tef:nom/*' )
do you get any results?

You could also do this via XSLT, if you have any experience of it.
I'm guessing it's something like this you're starting with: http://www.abes.fr/abes/documents/tef/recommandation/ex1_theseSimplePDF.xml


These might explain a bit more about namespaces:
http://stackoverflow.com/a/4083929/2455451
http://stackoverflow.com/questions/2673370/why-should-i-use-xpathcontext-with-perls-xmllibxml/2673452#2673452
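
As a minimal sketch of the namespace-aware approach those links describe: register the tef prefix on an XML::LibXML::XPathContext and query through that context. The namespace URI below is an assumption (copy the real one from the xmlns:tef declaration in your record), and the inline record is a cut-down stand-in whose layout is guessed from the fragments in your message. Note that the sketch selects tef:nom itself: a trailing /* selects child elements, and tef:nom only holds text, so that step would match nothing.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Assumed namespace URI for the "tef" prefix; copy the real value from the
# xmlns:tef declaration in the harvested record.
my $TEF_NS = 'http://www.abes.fr/abes/documents/tef';

# Cut-down stand-in for one harvested record (not a complete TEF document).
my $record = <<"XML";
<record xmlns:tef="$TEF_NS">
  <tef:thesisAdmin>
    <tef:auteur>
      <tef:nom>nom1</tef:nom>
    </tef:auteur>
    <tef:directeurThese>
      <tef:nom>nom2</tef:nom>
      <tef:prenom>Carine</tef:prenom>
    </tef:directeurThese>
    <tef:directeurThese>
      <tef:nom>nom3</tef:nom>
      <tef:prenom>Louise</tef:prenom>
    </tef:directeurThese>
  </tef:thesisAdmin>
</record>
XML

my $doc = XML::LibXML->load_xml( string => $record );

# Bind the prefix used in the XPath expressions to the namespace URI.
my $xpc = XML::LibXML::XPathContext->new( $doc );
$xpc->registerNs( tef => $TEF_NS );

# Author names: select tef:nom itself, with no trailing /* step.
my @authors = map { $_->textContent } $xpc->findnodes( '//tef:auteur/tef:nom' );

# Supervisors: one hash per tef:directeurThese, so each nom stays paired
# with its prenom even though the element names repeat.
my @supervisors;
foreach my $node ( $xpc->findnodes( '//tef:directeurThese' ) )
{
    push @supervisors, {
        nom    => $xpc->findvalue( 'tef:nom',    $node ),
        prenom => $xpc->findvalue( 'tef:prenom', $node ),
    };
}

print "auteur: $_\n" for @authors;
print "directeur: $_->{prenom} $_->{nom}\n" for @supervisors;
```

Querying relative to each tef:directeurThese node is what keeps the repeated tef:nom elements apart; the prefix registered in the code only has to match the namespace URI, not the prefix used in the document.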

Cheers,
John

From: eprints-tech-bounces at ecs.soton.ac.uk On Behalf Of Jean-Marie Le Bechec
Sent: 03 March 2014 08:18
To: eprints-tech at ecs.soton.ac.uk
Subject: [EP-tech] harvester (question)

Hi Seb,

I need to harvest an OAI server in a format other than Dublin Core (the TEF format). I cannot extract specific metadata when several elements share the same name.

For example:
...
<tef:thesisAdmin>
                    <tef:auteur>
                      <tef:nom>nom1</tef:nom>
...

and
...
<tef:directeurThese>
                      <tef:nom>nom2</tef:nom>
                      <tef:prenom>Carine</tef:prenom>
                      <tef:autoriteInterne>MADS_DIRECTEUR_DE_THESE_1</tef:autoriteInterne>
                      <tef:autoriteExterne autoriteSource="Sudoc">073367826</tef:autoriteExterne>
                    </tef:directeurThese>
                    <tef:directeurThese>
                      <tef:nom>nom3</tef:nom>
                      <tef:prenom>Louise</tef:prenom>
                      <tef:autoriteInterne>MADS_DIRECTEUR_DE_THESE_2</tef:autoriteInterne>
                      <tef:autoriteExterne autoriteSource="Sudoc">035036672</tef:autoriteExterne>
                    </tef:directeurThese>
...
in the same record!

I need to extract all this data.

I tried things like:

my $nom;
foreach my $node ($xml->findnodes( "//auteur/nom/*" ))
{
    $nom = $node->textContent;
}

but it does not work (no results).

Any ideas?


Thanks !

Jean-Marie



--



***********************************************

Jean Marie Le Bechec

Service Commun de la Documentation

Responsable ingenierie documentaire

&

Direction du Systeme d'Information

Referent Etudes



Institut National Polytechnique de Toulouse

6 allee Emile Monso - bp 34038 -

31029 Toulouse cedex 4

Tel : 05 34 32 31 16

Mail : lebechec at inp-toulouse.fr

***********************************************