
[EP-tech] Experimental Schema.org support for EPrints



Yeah, that's something I considered, but I figured it's too much of a
learning curve for people to manage on top of all their other work.
Schema files are easier. Producing valid JSON-LD means climbing a fair
way up that curve first.
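
For comparison, here is a rough sketch of what the JSON-LD route involves, written in Python rather than EPrints' own Perl. The type mapping follows the epc:choose block in schema_org.xml quoted below (using schema.org's `WebSite` capitalisation). The function name `eprint_to_jsonld`, the flat input dict and the sample record are assumptions made for illustration; they are not part of EPrints or of the json-ld plugin linked below.

```python
import json

# EPrints item type -> schema.org type, following the quoted schema_org.xml.
SCHEMA_TYPES = {
    "article": "ScholarlyArticle",
    "book": "Book",
    "conference_item": "ScholarlyArticle",
    "monograph": "ScholarlyArticle",
    "thesis": "Thesis",
    "dataset": "Dataset",
    "mu_item": "MusicComposition",
    "review": "Review",
    "website": "WebSite",
}


def eprint_to_jsonld(eprint):
    """Build a schema.org JSON-LD dict from a plain dict of eprint fields.

    `eprint` is a hypothetical flat dict, not a real EPrints object.
    """
    doc = {
        "@context": "https://schema.org",
        # Unmapped types fall back to CreativeWork, like epc:otherwise.
        "@type": SCHEMA_TYPES.get(eprint.get("type"), "CreativeWork"),
        "name": eprint["title"],
        "headline": eprint["title"],
    }
    if eprint.get("abstract"):
        doc["description"] = eprint["abstract"]
    if eprint.get("publisher"):
        doc["publisher"] = {"@type": "Organization", "name": eprint["publisher"]}
    creators = [{"@type": "Person", "name": n} for n in eprint.get("creators", [])]
    if creators:
        doc["creator"] = creators
    return doc


# Made-up sample record, for illustration only.
sample = {
    "type": "article",
    "title": "An example title",
    "publisher": "Example Press",
    "creators": ["A. Author", "B. Writer"],
}
print(json.dumps(eprint_to_jsonld(sample), indent=2))
```

The resulting JSON would be embedded in the abstract page inside a `<script type="application/ld+json">` element. The sketch covers only a handful of fields; the citation files below handle rather more.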


On 21/11/2017 16:56, Lizz Jennings wrote:
> I did implement JSON-LD on the Bath Research Data Archive - won't necessarily translate to publications repos:
>
> https://github.com/eprintsug/json-ld
>
> Lizz
>
> --
> Lizz Jennings MSc MCLIP (Revalidated 2017)
> Developer
> Wessex House 4.16, University of Bath, Bath, BA2 7AY UK
> E.Jennings at bath.ac.uk
>
> -----Original Message-----
> From: eprints-tech-bounces at ecs.soton.ac.uk [mailto:eprints-tech-bounces at ecs.soton.ac.uk] On Behalf Of Christopher Gutteridge
> Sent: 21 November 2017 16:46
> To: eprints-tech at ecs.soton.ac.uk
> Subject: [EP-tech] Experimental Schema.org support for EPrints
>
> Hi, EPrints-tech, long time no see.
>
> I've recently rejoined the EPrints.soton.ac.uk support team, and was asked about trying out schema.org support (which Google and Bing like).
> I'm not a huge fan as I like peer-to-peer data, rather than via the big search engines, but I gave it a go anyway.
>
> I have been working on a way to add schema.org support to EPrints. It's using an invisible <div> which may not be everyone's preferred way of doing it, but has the advantage of working well with the citation files.
>
> Other options would be to design the entire abstract page around this feature (possible, but work to add to existing sites) or use JSON-LD which is what I would do if I was doing it for just me, but making a configuration file to generate JSON-LD would be more work for me and more of a learning curve for the EPrints admin.
>
> I've added it as a pilot to https://eprints.soton.ac.uk/ (subject to removal or change at any time)
>
> See the data extracted from a page here:
> https://search.google.com/structured-data/testing-tool#url=https%3A%2F%2Feprints.soton.ac.uk%2F50995%2F
>
> There's lots more work to polish this, but it's worth showing off now.
>
> I've used 3 citation files for this: an outer one to handle the different types (a bit ugly, but it was the solution I came up with), a second one to process fields that come in a standard install of EPrints, and a third for the fields eprints.soton has customised heavily.
>
> In the main summary_page.xml I added:
>
>     <epc:print expr="$item.citation('schema_org')" />
>
> Which links to schema_org.xml:
>
> <?xml version="1.0" ?>
> <!DOCTYPE html SYSTEM "entities.dtd" >
>
> <!--
>     Full "abstract page" (or splash page or summary page, depending on your jargon) for an eprint.
> -->
>
> <cite:citation xmlns="http://www.w3.org/1999/xhtml"
> xmlns:epc="http://eprints.org/ep3/control"
> xmlns:cite="http://eprints.org/ep3/citation" >
>
> <div style='display:none'>
>   <epc:choose>
>     <epc:when test="type = 'article'">
>       <div itemscope="itemscope" itemtype="http://schema.org/ScholarlyArticle">
>         <epc:print expr="$item.citation('schema_org_main')" />
>       </div>
>     </epc:when>
>     <epc:when test="type = 'book'">
>       <div itemscope="itemscope" itemtype="http://schema.org/Book">
>         <epc:print expr="$item.citation('schema_org_main')" />
>       </div>
>     </epc:when>
>     <!-- book_section -->
>     <epc:when test="type = 'conference_item'">
>       <div itemscope="itemscope" itemtype="http://schema.org/ScholarlyArticle">
>         <epc:print expr="$item.citation('schema_org_main')" />
>       </div>
>     </epc:when>
>     <epc:when test="type = 'monograph'">
>       <div itemscope="itemscope" itemtype="http://schema.org/ScholarlyArticle">
>         <epc:print expr="$item.citation('schema_org_main')" />
>       </div>
>     </epc:when>
>     <!-- patent -->
>     <epc:when test="type = 'thesis'">
>       <div itemscope="itemscope" itemtype="http://schema.org/Thesis">
>         <epc:print expr="$item.citation('schema_org_main')" />
>       </div>
>     </epc:when>
>     <epc:when test="type = 'dataset'">
>       <div itemscope="itemscope" itemtype="http://schema.org/Dataset">
>         <epc:print expr="$item.citation('schema_org_main')" />
>       </div>
>     </epc:when>
>     <!-- ad_item // art design item // -->
>     <epc:when test="type = 'mu_item'">
>       <div itemscope="itemscope" itemtype="http://schema.org/MusicComposition">
>         <epc:print expr="$item.citation('schema_org_main')" />
>       </div>
>     </epc:when>
>     <!-- letter -->
>     <!-- editorial -->
>     <epc:when test="type = 'review'">
>       <div itemscope="itemscope" itemtype="http://schema.org/Review">
>         <epc:print expr="$item.citation('schema_org_main')" />
>       </div>
>     </epc:when>
>     <!-- special_issue -->
>     <!-- meeting_abstract -->
>     <!-- software // SoftwareApplication/ SoftwareSourceCode ?? -->
>     <epc:when test="type = 'website'">
>       <div itemscope="itemscope" itemtype="http://schema.org/WebSite">
>         <epc:print expr="$item.citation('schema_org_main')" />
>       </div>
>     </epc:when>
>
>     <epc:otherwise>
>       <div itemscope="itemscope" itemtype="http://schema.org/CreativeWork">
>         <epc:print expr="$item.citation('schema_org_main')" />
>       </div>
>     </epc:otherwise>
>   </epc:choose>
> </div>
>
> </cite:citation>
>
> Each of these options in turn links to the main one,
> schema_org_main.xml, which uses default EPrints fields:
>
> <?xml version="1.0" ?>
> <!DOCTYPE html SYSTEM "entities.dtd" >
>
> <cite:citation xmlns="http://www.w3.org/1999/xhtml"
> xmlns:epc="http://eprints.org/ep3/control"
> xmlns:cite="http://eprints.org/ep3/citation" >
>
> <div itemprop="name"><epc:print expr="title" /></div>
> <div itemprop="headline"><epc:print expr="title" /></div>
> <img itemprop="image"
> src="http://www.eprints.org/uk/wp-content/uploads/EprintsServices2015icon.jpg"
> />
> <epc:if test="abstract">
>   <div itemprop="description"><epc:print expr="abstract" /></div>
> </epc:if>
> <epc:if test="keywords">
>   <div itemprop="keywords"><epc:print expr="keywords" /></div>
> </epc:if>
> <epc:if test="isbn">
>   <div itemprop="isbn"><epc:print expr="isbn" /></div>
> </epc:if>
> <epc:if test="id_number">
>   <div itemprop="identifier"><epc:print expr="id_number" /></div>
> </epc:if>
>
> <epc:if test="issn or series">
>   <div itemprop="isPartOf" itemscope="itemscope" itemtype="http://schema.org/Periodical">
>     <epc:if test="issn"><div itemprop="issn"><epc:print expr="issn" /></div></epc:if>
>     <epc:if test="series"><div itemprop="name"><epc:print expr="series" /></div></epc:if>
>   </div>
> </epc:if>
>
> <epc:comment>
>   <!-- pageEnd and pageStart could go here but are more bother to extract. -->
> </epc:comment>
>
> <epc:if test="pagerange">
>   <div itemprop="pagination"><epc:print expr="as_string(pagerange)" /></div>
> </epc:if>
> <epc:if test="publisher">
>   <div itemprop="publisher" itemscope="itemscope" itemtype="http://schema.org/Organization">
>     <div itemprop="name"><epc:print expr="publisher" /></div>
>   </div>
> </epc:if>
> <epc:if test="official_url">
>   <div itemprop="url"><epc:print expr="official_url" /></div>
> </epc:if>
>
> <epc:if test="creators">
>   <epc:foreach expr="creators" iterator="person">
>     <div itemprop="creator" itemscope="itemscope" itemtype="http://schema.org/Person">
>       <div itemprop="name"><epc:print expr="$person.subproperty('name')" /></div>
>       <epc:if test="$person.subproperty('id')">
>         <div itemprop="identifier"><epc:print expr="$person.subproperty('id')" /></div>
>       </epc:if>
>     </div>
>   </epc:foreach>
> </epc:if>
> <epc:if test="editors">
>   <epc:foreach expr="editors" iterator="person">
>     <div itemprop="editor" itemscope="itemscope" itemtype="http://schema.org/Person">
>       <div itemprop="name"><epc:print expr="$person.subproperty('name')" /></div>
>       <epc:if test="$person.subproperty('id')">
>         <div itemprop="identifier"><epc:print expr="$person.subproperty('id')" /></div>
>       </epc:if>
>     </div>
>   </epc:foreach>
> </epc:if>
>
> <epc:if test="corp_creators">
>   <epc:foreach expr="corp_creators" iterator="org">
>     <div itemprop="creator" itemscope="itemscope" itemtype="http://schema.org/Organization">
>       <div itemprop="name"><epc:print expr="$org.subproperty('name')" /></div>
>     </div>
>   </epc:foreach>
> </epc:if>
>
>
> <epc:comment>
>   ADD IN LOCAL EXTENSIONS USING THIS FILE
> </epc:comment>
> <epc:print expr="$item.citation('schema_org_local')" />
>
> </cite:citation>
>
> Finally I created schema_org_local.xml for the fields like date and
> creators which we've heavily messed around with.
>
> <?xml version="1.0" ?>
> <!DOCTYPE html SYSTEM "entities.dtd" >
>
> <!--
>     Local extra content for schema.org info on summary page.
>
>     This file can be used to add new fields that are not standard for EPrints.
> -->
>
> <cite:citation xmlns="http://www.w3.org/1999/xhtml"
> xmlns:epc="http://eprints.org/ep3/control"
> xmlns:cite="http://eprints.org/ep3/citation" >
>
> <epc:if test="dates">
>   <epc:foreach expr="dates" iterator="date">
>     <epc:if test="$date.subproperty('date_type') = 'published'">
>       <div itemprop="datePublished"><epc:print expr="$date.subproperty('date')" /></div>
>     </epc:if>
>     <epc:if test="$date.subproperty('date_type') = 'completed'">
>       <div itemprop="dateCompleted"><epc:print expr="$date.subproperty('date')" /></div>
>     </epc:if>
>   </epc:foreach>
> </epc:if>
>
> <epc:if test="contributors">
>   <epc:foreach expr="contributors" iterator="person">
>     <div itemprop="contributor" itemscope="itemscope" itemtype="http://schema.org/Person">
>       <div itemprop="name"><epc:print expr="$person.subproperty('name')" /></div>
>       <epc:if test="$person.subproperty('id')">
>         <div itemprop="identifier"><epc:print expr="$person.subproperty('id')" /></div>
>       </epc:if>
>     </div>
>   </epc:foreach>
> </epc:if>
>
> </cite:citation>
>
>
> I'm not sure how useful all this is but figured I'd throw it out there.
> It uses a default image as for some reason the Google checker insisted.
> It doesn't link to files or mention subjects, doesn't include URIs
> properly and doesn't link to ORCID etc. (which is data we have in
> eprints.soton).
>
>
>

-- 
Christopher Gutteridge -- http://users.ecs.soton.ac.uk/cjg

University of Southampton Open Data Service: http://data.southampton.ac.uk/
You should read our Web & Data Innovation blog: http://blogs.ecs.soton.ac.uk/webteam/