eprints_tech messages
[EP-tech] Re: EPrints and attributes in XML
From: "Helge Knuettel" <Helge.Knuettel AT bibliothek.uni-regensburg.de>
Date: Tue, 11 Nov 2008 13:48:19 +0100
http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** EPrints community wiki - http://wiki.eprints.org/

Hi Tim,

sorry for the delayed answer. In PubMedXML.pm I added the following code to xml_to_epdata:

    # DOI
    my $pubmeddata = $xml->getElementsByTagName("PubmedData")->item(0);
    if ( defined $pubmeddata )
    {
        my $articleidlist = $pubmeddata->getElementsByTagName("ArticleIdList")->item(0);
        if ( defined $articleidlist )
        {
            foreach my $articleid ( $articleidlist->getElementsByTagName("ArticleId") )
            {
                if ( defined $articleid )
                {
                    if ( $articleid->getAttribute( "IdType" ) eq "doi" )
                    {
                        my $doi = {};
                        $doi->{type} = "doi";
                        $doi->{name} = $plugin->xml_to_text( $articleid );
                        push @{ $epdata->{id_number} }, $doi;
                    }
                }
            }
        }
    }

The code trying to access attributes never returned a result. Therefore, I looked at the XML arriving in xml_to_epdata:

    sub xml_to_epdata
    {
        # $xml is the PubmedArticle element
        my( $plugin, $dataset, $xml ) = @_;

        # For debugging: Check what XML is arriving here:
        print STDERR $xml->toString();
        ...

I found that no attributes were present in the output. OK, this might also be due to the rendering by toString.

Currently I am using the following workaround:

    # DOI
    my $pubmeddata = $xml->getElementsByTagName("PubmedData")->item(0);
    if ( defined $pubmeddata )
    {
        my $articleidlist = $pubmeddata->getElementsByTagName("ArticleIdList")->item(0);
        if ( defined $articleidlist )
        {
            foreach my $articleid ( $articleidlist->getElementsByTagName("ArticleId") )
            {
                if ( defined $articleid )
                {
                    # So far no attributes are available in the XML document at this stage.
                    # They were probably lost when parsing the document.
                    # Therefore we are using a workaround and use any ArticleId as a DOI
                    # when it starts with "10.". This is not always true!!!!
                    my $value = $plugin->xml_to_text( $articleid );
                    if ( $value =~ m/^10\./ )
                    {
                        push @{ $epdata->{id_number} }, { 'type' => 'doi', 'name' => $value };
                    }
                }
            }
        }
    }

Helge

--
----
Dr. rer. nat. Helge Knüttel
Fachreferat Medizin, Informationsvermittlung Medizin
Universitätsbibliothek Regensburg
D-93042 Regensburg, Germany
phone: ++49 941 944-5937; fax: ++49 941 944-5938
email: helge.knuettel AT bibliothek.uni-regensburg.de
WWW: http://www.bibliothek.uni-regensburg.de/tb/medizin/start.htm
-----------------------------------------------------------

>>> Tim Brody <tdb01r AT ecs.soton.ac.uk> wrote on 29.10.2008 at 17:29 in message <49088F53.9000200 AT ecs.soton.ac.uk>:
> Helge Knuettel wrote:
>> Hi,
>>
>> To me it looks as if EPrints ignores attributes in XML. Is it possible
>> to change that without major work?
>>
>> Background: I am enhancing the PubMed import plugin for our purposes.
>> DOI import would be nice. In PubMed's XML the type of an ID (e.g.
>> PubMed-ID, DOI and several others) is coded in an attribute. However,
>> when dealing with the GDOME XML document in
>> EPrints::Plugin::Import::PubMedXML no attributes are there. I assume
>> they were lost when parsing the XML string.
>>
> Could you give me a code example? I don't think anything should be
> stripping XML attributes.
>
> Tim.
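
The workaround in the message accepts any ArticleId that starts with "10." and notes that this is not always correct. As a minimal sketch only, the decision could be isolated in a small helper that trusts the IdType attribute whenever it is available and otherwise falls back to a somewhat stricter pattern. The helper name is_doi_article_id and the fallback pattern are assumptions made for illustration; they are not part of EPrints or of the plugin code above, and the pattern is a heuristic, not a full DOI validator.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of EPrints): decide whether an ArticleId
# should be stored as a DOI.
#   $idtype - value of the IdType attribute; may be undef or "" when the
#             attribute is not available
#   $value  - text content of the ArticleId element
sub is_doi_article_id
{
    my( $idtype, $value ) = @_;

    return 0 unless defined $value;
    $value =~ s/^\s+|\s+$//g;    # trim surrounding whitespace

    # Trust the attribute whenever the parser exposes it.
    if( defined $idtype && $idtype ne "" )
    {
        return lc( $idtype ) eq "doi" ? 1 : 0;
    }

    # Fallback heuristic (assumption): "10.", a numeric registrant code of
    # at least four digits, a slash, and a non-empty suffix.
    return $value =~ m{^10\.\d{4,}(?:\.\d+)*/\S+$} ? 1 : 0;
}

# Usage with made-up sample values.
my @samples = (
    [ "pubmed", "18987654" ],
    [ "doi",    "10.1000/xyz123" ],
    [ undef,    "10.1000/xyz123" ],
    [ undef,    "10.5" ],
);
foreach my $sample ( @samples )
{
    my( $idtype, $value ) = @$sample;
    printf "%-8s %-16s %s\n",
        defined $idtype ? $idtype : "(none)",
        $value,
        is_doi_article_id( $idtype, $value ) ? "doi" : "not a doi";
}
```

Inside the plugin loop, such a helper could be called with $articleid->getAttribute( "IdType" ) and $plugin->xml_to_text( $articleid ), so that the heuristic is only used while the attributes remain unavailable.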




