[EP-tech] Experimental Schema.org support for EPrints
- Subject: [EP-tech] Experimental Schema.org support for EPrints
- From: cjg at ecs.soton.ac.uk (Christopher Gutteridge)
- Date: Tue, 21 Nov 2017 16:46:27 +0000
Hi, EPrints-tech, long time no see.
I've recently rejoined the EPrints.soton.ac.uk support team, and was
asked about trying out schema.org support (which Google and Bing like).
I'm not a huge fan as I like peer-to-peer data, rather than via the big
search engines, but I gave it a go anyway.
I have been working on a way to add schema.org support to EPrints. It
uses an invisible <div>, which may not be everyone's preferred way of
doing it, but has the advantage of working well with the citation files.
Other options would be to design the entire abstract page around this
feature (possible, but extra work to add to existing sites) or to use
JSON-LD. JSON-LD is what I would do if I were doing it just for me, but
making a configuration file to generate JSON-LD would be more work for
me and more of a learning curve for the EPrints admin.
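For comparison, a JSON-LD version of the same information would be a
single script element in the page rather than nested markup. A minimal
sketch follows; the title, author and publisher values are placeholders
rather than real data, and this is not what the pilot generates:

```xml
<!-- Placeholder values throughout; not output from the pilot. -->
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "ScholarlyArticle",
  "name": "Example title",
  "headline": "Example title",
  "creator": [
    { "@type": "Person", "name": "Example Author" }
  ],
  "publisher": { "@type": "Organization", "name": "Example Publisher" }
}
</script>
```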
I've added it as a pilot to https://eprints.soton.ac.uk/ (subject to
removal or change at any time)
See the data extracted from a page here:
https://search.google.com/structured-data/testing-tool#url=https%3A%2F%2Feprints.soton.ac.uk%2F50995%2F
There's lots more work to do to polish this, but it's worth showing off now.
I've used 3 citation files for this: an outer one to handle the
different types (a bit ugly, but it was the solution I came up with),
a second one to process fields that come in a standard install of
EPrints, and a third for the fields eprints.soton has customised heavily.
In the main summary_page.xml I added:
  <epc:print expr="$item.citation('schema_org')" />
Which links to schema_org.xml:
<?xml version="1.0" ?>
<!DOCTYPE html SYSTEM "entities.dtd" >
<!--
    Full "abstract page" (or splash page or summary page, depending on
    your jargon) for an eprint.
-->
<cite:citation xmlns="http://www.w3.org/1999/xhtml"
    xmlns:epc="http://eprints.org/ep3/control"
    xmlns:cite="http://eprints.org/ep3/citation" >
<div style='display:none'>
  <epc:choose>
    <epc:when test="type = 'article'">
      <div itemscope="itemscope" itemtype="http://schema.org/ScholarlyArticle">
        <epc:print expr="$item.citation('schema_org_main')" />
      </div>
    </epc:when>
    <epc:when test="type = 'book'">
      <div itemscope="itemscope" itemtype="http://schema.org/Book">
        <epc:print expr="$item.citation('schema_org_main')" />
      </div>
    </epc:when>
    <!-- book_section -->
    <epc:when test="type = 'conference_item'">
      <div itemscope="itemscope" itemtype="http://schema.org/ScholarlyArticle">
        <epc:print expr="$item.citation('schema_org_main')" />
      </div>
    </epc:when>
    <epc:when test="type = 'monograph'">
      <div itemscope="itemscope" itemtype="http://schema.org/ScholarlyArticle">
        <epc:print expr="$item.citation('schema_org_main')" />
      </div>
    </epc:when>
    <!-- patent -->
    <epc:when test="type = 'thesis'">
      <div itemscope="itemscope" itemtype="http://schema.org/Thesis">
        <epc:print expr="$item.citation('schema_org_main')" />
      </div>
    </epc:when>
    <epc:when test="type = 'dataset'">
      <div itemscope="itemscope" itemtype="http://schema.org/Dataset">
        <epc:print expr="$item.citation('schema_org_main')" />
      </div>
    </epc:when>
    <!-- ad_item // art design item // -->
    <epc:when test="type = 'mu_item'">
      <div itemscope="itemscope" itemtype="http://schema.org/MusicComposition">
        <epc:print expr="$item.citation('schema_org_main')" />
      </div>
    </epc:when>
    <!-- letter -->
    <!-- editorial -->
    <epc:when test="type = 'review'">
      <div itemscope="itemscope" itemtype="http://schema.org/Review">
        <epc:print expr="$item.citation('schema_org_main')" />
      </div>
    </epc:when>
    <!-- special_issue -->
    <!-- meeting_abstract -->
    <!-- software // SoftwareApplication/ SoftwareSourceCode ?? -->
    <epc:when test="type = 'website'">
      <div itemscope="itemscope" itemtype="http://schema.org/WebSite">
        <epc:print expr="$item.citation('schema_org_main')" />
      </div>
    </epc:when>
    <epc:otherwise>
      <div itemscope="itemscope" itemtype="http://schema.org/CreativeWork">
        <epc:print expr="$item.citation('schema_org_main')" />
      </div>
    </epc:otherwise>
  </epc:choose>
</div>
</cite:citation>
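As a possible tidy-up, the three branches that map to ScholarlyArticle
could be merged into one. This sketch assumes the EPrints script
function one_of() is available in citation tests; it is untested
against this configuration:

```xml
<!-- Hypothetical replacement for the separate article, conference_item
     and monograph branches; assumes one_of() from EPrints script. -->
<epc:when test="type.one_of('article','conference_item','monograph')">
  <div itemscope="itemscope" itemtype="http://schema.org/ScholarlyArticle">
    <epc:print expr="$item.citation('schema_org_main')" />
  </div>
</epc:when>
```

The remaining branches and the epc:otherwise fallback would stay as
they are.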
Each of these options in turn links to the main one,
schema_org_main.xml, which uses default EPrints fields:
<?xml version="1.0" ?>
<!DOCTYPE html SYSTEM "entities.dtd" >
<cite:citation xmlns="http://www.w3.org/1999/xhtml"
    xmlns:epc="http://eprints.org/ep3/control"
    xmlns:cite="http://eprints.org/ep3/citation" >
<div itemprop="name"><epc:print expr="title" /></div>
<div itemprop="headline"><epc:print expr="title" /></div>
<img itemprop="image"
    src="http://www.eprints.org/uk/wp-content/uploads/EprintsServices2015icon.jpg" />
<epc:if test="abstract">
  <div itemprop="description"><epc:print expr="abstract" /></div>
</epc:if>
<epc:if test="keywords">
  <div itemprop="keywords"><epc:print expr="keywords" /></div>
</epc:if>
<epc:if test="isbn">
  <div itemprop="isbn"><epc:print expr="isbn" /></div>
</epc:if>
<epc:if test="id_number">
  <div itemprop="identifier"><epc:print expr="id_number" /></div>
</epc:if>
<epc:if test="issn or series">
  <div itemprop="isPartOf" itemscope="itemscope" itemtype="http://schema.org/Periodical">
    <epc:if test="issn"><div itemprop="issn"><epc:print expr="issn" /></div></epc:if>
    <epc:if test="series"><div itemprop="name"><epc:print expr="series" /></div></epc:if>
  </div>
</epc:if>
<epc:comment>
  <!-- pageEnd and pageStart could go here but are more bother to extract. -->
</epc:comment>
<epc:if test="pagerange">
  <div itemprop="pagination"><epc:print expr="as_string(pagerange)" /></div>
</epc:if>
<epc:if test="publisher">
  <div itemprop="publisher" itemscope="itemscope" itemtype="http://schema.org/Organization">
    <div itemprop="name"><epc:print expr="publisher" /></div>
  </div>
</epc:if>
<epc:if test="official_url">
  <div itemprop="url"><epc:print expr="official_url" /></div>
</epc:if>
<epc:if test="creators">
  <epc:foreach expr="creators" iterator="person">
    <div itemprop="creator" itemscope="itemscope" itemtype="http://schema.org/Person">
      <div itemprop="name"><epc:print expr="$person.subproperty('name')" /></div>
      <epc:if test="$person.subproperty('id')">
        <div itemprop="identifier"><epc:print expr="$person.subproperty('id')" /></div>
      </epc:if>
    </div>
  </epc:foreach>
</epc:if>
<epc:if test="editors">
  <epc:foreach expr="editors" iterator="person">
    <div itemprop="editor" itemscope="itemscope" itemtype="http://schema.org/Person">
      <div itemprop="name"><epc:print expr="$person.subproperty('name')" /></div>
      <epc:if test="$person.subproperty('id')">
        <div itemprop="identifier"><epc:print expr="$person.subproperty('id')" /></div>
      </epc:if>
    </div>
  </epc:foreach>
</epc:if>
<epc:if test="corp_creators">
  <epc:foreach expr="corp_creators" iterator="org">
    <div itemprop="creator" itemscope="itemscope" itemtype="http://schema.org/Organization">
      <epc:comment>
        Print the loop variable itself: $person is not defined in this
        loop. Assumes corp_creators is a plain multiple text field, as
        in a stock install.
      </epc:comment>
      <div itemprop="name"><epc:print expr="$org" /></div>
    </div>
  </epc:foreach>
</epc:if>
<epc:comment>
  ADD IN LOCAL EXTENSIONS USING THIS FILE
</epc:comment>
<epc:print expr="$item.citation('schema_org_local')" />
</cite:citation>
Finally, I created schema_org_local.xml for fields such as dates and
contributors, which we've heavily messed around with:
<?xml version="1.0" ?>
<!DOCTYPE html SYSTEM "entities.dtd" >
<!--
    Local extra content for schema.org info on summary page.
    This file can be used to add new fields that are not standard for
    EPrints.
-->
<cite:citation xmlns="http://www.w3.org/1999/xhtml"
    xmlns:epc="http://eprints.org/ep3/control"
    xmlns:cite="http://eprints.org/ep3/citation" >
<epc:if test="dates">
  <epc:foreach expr="dates" iterator="date">
    <epc:if test="$date.subproperty('date_type') = 'published'">
      <div itemprop="datePublished"><epc:print expr="$date.subproperty('date')" /></div>
    </epc:if>
    <epc:if test="$date.subproperty('date_type') = 'completed'">
      <div itemprop="dateCompleted"><epc:print expr="$date.subproperty('date')" /></div>
    </epc:if>
  </epc:foreach>
</epc:if>
<epc:if test="contributors">
  <epc:foreach expr="contributors" iterator="person">
    <div itemprop="contributor" itemscope="itemscope" itemtype="http://schema.org/Person">
      <div itemprop="name"><epc:print expr="$person.subproperty('name')" /></div>
      <epc:if test="$person.subproperty('id')">
        <div itemprop="identifier"><epc:print expr="$person.subproperty('id')" /></div>
      </epc:if>
    </div>
  </epc:foreach>
</epc:if>
</cite:citation>
I'm not sure how useful all this is, but figured I'd throw it out there.
It uses a default image because, for some reason, the Google checker
insisted on one. It doesn't link to files or mention subjects, doesn't
include URIs properly and doesn't link to ORCID etc. (which is data we
have in eprints.soton).
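On the URI point: in microdata, a property whose value is a URL is
normally carried in the href of an a or link element rather than in
element text. A minimal sketch for official_url, assuming the
{expression} attribute syntax of EPrints citation files; it has not
been tried on the pilot:

```xml
<epc:if test="official_url">
  <!-- A link element is not rendered, so it fits inside the hidden div;
       microdata takes the property value from href. -->
  <link itemprop="url" href="{official_url}" />
</epc:if>
```

The same pattern might suit ORCID or subject URIs, but the field names
for those depend on the local configuration.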
--
Christopher Gutteridge -- http://users.ecs.soton.ac.uk/cjg
University of Southampton Open Data Service: http://data.southampton.ac.uk/
You should read our Web & Data Innovation blog: http://blogs.ecs.soton.ac.uk/webteam/