[EP-tech] Experimental Schema.org support for EPrints



Hi Christopher,

Nice to see you are back! Concerning the Schema.org support, I did
something "custom" here:

http://en.unesco.org/mediabank

I think the proper way to go would be a plugin, though...

Denis


On 21/11/2017 17:46, Christopher Gutteridge wrote:
> Hi, EPrints-tech, long time no see.
>
> I've recently rejoined the EPrints.soton.ac.uk support team, and was 
> asked about trying out schema.org support (which Google and Bing like). 
> I'm not a huge fan as I like peer-to-peer data, rather than via the big 
> search engines, but I gave it a go anyway.
>
> I have been working on a way to add schema.org support to EPrints. It's 
> using an invisible <div> which may not be everyone's preferred way of 
> doing it, but has the advantage of working well with the citation files.
>
> Other options would be to design the entire abstract page around this 
> feature (possible, but work to add to existing sites) or use JSON-LD 
> which is what I would do if I was doing it for just me, but making a 
> configuration file to generate JSON-LD would be more work for me and 
> more of a learning curve for the EPrints admin.
>
> I've added it as a pilot to https://eprints.soton.ac.uk/ (subject to 
> removal or change at any time)
>
> See the data extracted from a page here: 
> https://search.google.com/structured-data/testing-tool#url=https%3A%2F%2Feprints.soton.ac.uk%2F50995%2F
>
> There's lots more work to polish this, but it's worth showing off now.
>
> I've used 3 citation files for this: an outer one to handle the
> different types (this is a bit ugly, but was the solution I came up
> with), a second one to process fields that come in a standard install
> of EPrints, and a third for the fields eprints.soton has customised
> heavily.
>
> In the main summary_page.xml I added:
>
>   <epc:print expr="$item.citation('schema_org')" />
>
> Which links to schema_org.xml:
>
> <?xml version="1.0" ?>
> <!DOCTYPE html SYSTEM "entities.dtd" >
>
> <!--
>     Full "abstract page" (or splash page or summary page, depending on
>     your jargon) for an eprint.
> -->
>
> <cite:citation xmlns="http://www.w3.org/1999/xhtml"
>     xmlns:epc="http://eprints.org/ep3/control"
>     xmlns:cite="http://eprints.org/ep3/citation">
>
> <div style='display:none'>
>   <epc:choose>
>     <epc:when test="type = 'article'">
>       <div itemscope="itemscope"
>           itemtype="http://schema.org/ScholarlyArticle">
>         <epc:print expr="$item.citation('schema_org_main')" />
>       </div>
>     </epc:when>
>     <epc:when test="type = 'book'">
>       <div itemscope="itemscope" itemtype="http://schema.org/Book">
>         <epc:print expr="$item.citation('schema_org_main')" />
>       </div>
>     </epc:when>
>     <!-- book_section -->
>     <epc:when test="type = 'conference_item'">
>       <div itemscope="itemscope"
>           itemtype="http://schema.org/ScholarlyArticle">
>         <epc:print expr="$item.citation('schema_org_main')" />
>       </div>
>     </epc:when>
>     <epc:when test="type = 'monograph'">
>       <div itemscope="itemscope"
>           itemtype="http://schema.org/ScholarlyArticle">
>         <epc:print expr="$item.citation('schema_org_main')" />
>       </div>
>     </epc:when>
>     <!-- patent -->
>     <epc:when test="type = 'thesis'">
>       <div itemscope="itemscope" itemtype="http://schema.org/Thesis">
>         <epc:print expr="$item.citation('schema_org_main')" />
>       </div>
>     </epc:when>
>     <epc:when test="type = 'dataset'">
>       <div itemscope="itemscope" itemtype="http://schema.org/Dataset">
>         <epc:print expr="$item.citation('schema_org_main')" />
>       </div>
>     </epc:when>
>     <!-- ad_item // art design item // -->
>     <epc:when test="type = 'mu_item'">
>       <div itemscope="itemscope"
>           itemtype="http://schema.org/MusicComposition">
>         <epc:print expr="$item.citation('schema_org_main')" />
>       </div>
>     </epc:when>
>     <!-- letter -->
>     <!-- editorial -->
>     <epc:when test="type = 'review'">
>       <div itemscope="itemscope" itemtype="http://schema.org/Review">
>         <epc:print expr="$item.citation('schema_org_main')" />
>       </div>
>     </epc:when>
>     <!-- special_issue -->
>     <!-- meeting_abstract -->
>     <!-- software // SoftwareApplication/ SoftwareSourceCode ?? -->
>     <epc:when test="type = 'website'">
>       <div itemscope="itemscope" itemtype="http://schema.org/WebSite">
>         <epc:print expr="$item.citation('schema_org_main')" />
>       </div>
>     </epc:when>
>
>     <epc:otherwise>
>       <div itemscope="itemscope" itemtype="http://schema.org/CreativeWork">
>         <epc:print expr="$item.citation('schema_org_main')" />
>       </div>
>     </epc:otherwise>
>   </epc:choose>
> </div>
>
> </cite:citation>
>
> Each of these options in turn links to the main one,
> schema_org_main.xml, that uses default EPrints fields:
>
> <?xml version="1.0" ?>
> <!DOCTYPE html SYSTEM "entities.dtd" >
>
> <cite:citation xmlns="http://www.w3.org/1999/xhtml"
>     xmlns:epc="http://eprints.org/ep3/control"
>     xmlns:cite="http://eprints.org/ep3/citation">
>
> <div itemprop="name"><epc:print expr="title" /></div>
> <div itemprop="headline"><epc:print expr="title" /></div>
> <img itemprop="image"
>     src="http://www.eprints.org/uk/wp-content/uploads/EprintsServices2015icon.jpg"
>     />
> <epc:if test="abstract">
>   <div itemprop="description"><epc:print expr="abstract" /></div>
> </epc:if>
> <epc:if test="keywords">
>   <div itemprop="keywords"><epc:print expr="keywords" /></div>
> </epc:if>
> <epc:if test="isbn">
>   <div itemprop="isbn"><epc:print expr="isbn" /></div>
> </epc:if>
> <epc:if test="id_number">
>   <div itemprop="identifier"><epc:print expr="id_number" /></div>
> </epc:if>
>
> <epc:if test="issn or series">
>   <div itemprop="isPartOf" itemscope="itemscope"
>       itemtype="http://schema.org/Periodical">
>     <epc:if test="issn"><div itemprop="issn"><epc:print expr="issn"
>       /></div></epc:if>
>     <epc:if test="series"><div itemprop="name"><epc:print expr="series"
>       /></div></epc:if>
>   </div>
> </epc:if>
>
> <epc:comment>
>   <!-- pageEnd and pageStart could go here but are more bother to
>   extract. -->
> </epc:comment>
>
> <epc:if test="pagerange">
>   <div itemprop="pagination"><epc:print expr="as_string(pagerange)"
>     /></div>
> </epc:if>
> <epc:if test="publisher">
>   <div itemprop="publisher" itemscope="itemscope"
>       itemtype="http://schema.org/Organization">
>     <div itemprop="name"><epc:print expr="publisher" /></div>
>   </div>
> </epc:if>
> <epc:if test="official_url">
>   <div itemprop="url"><epc:print expr="official_url" /></div>
> </epc:if>
>
> <epc:if test="creators">
>   <epc:foreach expr="creators" iterator="person">
>     <div itemprop="creator" itemscope="itemscope"
>         itemtype="http://schema.org/Person">
>       <div itemprop="name"><epc:print
>         expr="$person.subproperty('name')" /></div>
>       <epc:if test="$person.subproperty('id')">
>         <div itemprop="identifier"><epc:print
>           expr="$person.subproperty('id')" /></div>
>       </epc:if>
>     </div>
>   </epc:foreach>
> </epc:if>
> <epc:if test="editors">
>   <epc:foreach expr="editors" iterator="person">
>     <div itemprop="editor" itemscope="itemscope"
>         itemtype="http://schema.org/Person">
>       <div itemprop="name"><epc:print
>         expr="$person.subproperty('name')" /></div>
>       <epc:if test="$person.subproperty('id')">
>         <div itemprop="identifier"><epc:print
>           expr="$person.subproperty('id')" /></div>
>       </epc:if>
>     </div>
>   </epc:foreach>
> </epc:if>
>
> <epc:if test="corp_creators">
>   <epc:foreach expr="corp_creators" iterator="org">
>     <div itemprop="creator" itemscope="itemscope"
>         itemtype="http://schema.org/Organization">
>       <div itemprop="name"><epc:print expr="$org" /></div>
>     </div>
>   </epc:foreach>
> </epc:if>
>
>
> <epc:comment>
>   ADD IN LOCAL EXTENSIONS USING THIS FILE
> </epc:comment>
> <epc:print expr="$item.citation('schema_org_local')" />
>
> </cite:citation>
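
To make the effect of the two files above concrete, here is a rough sketch of the markup they might render for an eprint of type 'article'. This is an illustration only, not output copied from eprints.soton.ac.uk: the title, ISSN and creator name are made-up placeholders, the exact rendering of the name field is an assumption, and fields that are not set on the record (abstract, keywords and so on) are simply omitted.

```xml
<!-- Hypothetical rendered output; all field values are placeholders. -->
<div style="display:none">
  <div itemscope="itemscope" itemtype="http://schema.org/ScholarlyArticle">
    <div itemprop="name">An Example Article Title</div>
    <div itemprop="headline">An Example Article Title</div>
    <img itemprop="image"
        src="http://www.eprints.org/uk/wp-content/uploads/EprintsServices2015icon.jpg" />
    <!-- Nested item: the journal the article is part of -->
    <div itemprop="isPartOf" itemscope="itemscope"
        itemtype="http://schema.org/Periodical">
      <div itemprop="issn">1234-5678</div>
    </div>
    <!-- One nested Person item per entry in the creators field -->
    <div itemprop="creator" itemscope="itemscope"
        itemtype="http://schema.org/Person">
      <div itemprop="name">Smith, Jane</div>
    </div>
  </div>
</div>
```

Each nested `itemscope` becomes a separate item in the extracted data, so the structured-data testing tool should show the Periodical and the Person as children of the ScholarlyArticle.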
>
> Finally I created schema_org_local.xml for the fields like date and 
> creators which we've heavily messed around with.
>
> <?xml version="1.0" ?>
> <!DOCTYPE html SYSTEM "entities.dtd" >
>
> <!--
>     Local extra content for schema.org info on summary page.
>
>     This file can be used to add new fields that are not standard for
>     EPrints.
> -->
>
> <cite:citation xmlns="http://www.w3.org/1999/xhtml"
>     xmlns:epc="http://eprints.org/ep3/control"
>     xmlns:cite="http://eprints.org/ep3/citation">
>
> <epc:if test="dates">
>   <epc:foreach expr="dates" iterator="date">
>     <epc:if test="$date.subproperty('date_type') = 'published'">
>       <div itemprop="datePublished"><epc:print
>         expr="$date.subproperty('date')" /></div>
>     </epc:if>
>     <epc:if test="$date.subproperty('date_type') = 'completed'">
>       <div itemprop="dateCompleted"><epc:print
>         expr="$date.subproperty('date')" /></div>
>     </epc:if>
>   </epc:foreach>
> </epc:if>
>
> <epc:if test="contributors">
>   <epc:foreach expr="contributors" iterator="person">
>     <div itemprop="contributor" itemscope="itemscope"
>         itemtype="http://schema.org/Person">
>       <div itemprop="name"><epc:print
>         expr="$person.subproperty('name')" /></div>
>       <epc:if test="$person.subproperty('id')">
>         <div itemprop="identifier"><epc:print
>           expr="$person.subproperty('id')" /></div>
>       </epc:if>
>     </div>
>   </epc:foreach>
> </epc:if>
>
> </cite:citation>
>
>
> I'm not sure how useful all this is, but I figured I'd throw it out there.
> It uses a default image because, for some reason, the Google checker
> insisted on one.
> It doesn't link to files or mention subjects, doesn't include URIs 
> properly and doesn't link to ORCID etc. (which is data we have in 
> eprints.soton).
>
>
>

