EPrints Technical Mailing List Archive

Message: #03228



[EP-tech] Re: OAI-PMH


On 04.07.2014 15:34, Ian Stuart wrote:
But have a good look at what
http://<your.server.address>/cgi/oai2?verb=ListSets returns (and there's
a reason for those bloody horrible SetNames)

Hi Ian,

For special needs beyond what the advanced search can do, EPrints should support partially dynamic but human-readable set names defined in the configuration. We have OAI clients that *expect* set names composed in a certain way, e.g. containing literal field values as in "ddc:300" or "doc-type:article".

At the moment we have to use a hack to include a list of semi-automatically pregenerated custom sets in the configuration. Admittedly, I would prefer a better (e.g. out-of-the-box) way to achieve this:


# in cfg.d/oai.pl:
use UBHD::EPrints::DINISets2010 qw(cfg_setspecs);

$oai->{custom_sets} = [
        cfg_setspecs(),
      # ...
];

# in a .pm file residing in an @INC directory:

package UBHD::EPrints::DINISets2010;
use strict;
use warnings;
use base 'Exporter';

our @EXPORT_OK = qw( sets_for_eprint cfg_setspecs );

# For each set prefix: "spec" builds the custom-set filter for one value,
# "eprint" returns the setSpecs a given eprint belongs to.
my %fields = (
    'ddc' => {
        spec   => sub { return { filters => [{ meta_fields => [ "subjects" ], value => shift }] } },
        eprint => sub { my $ep = shift; return map { "ddc:$_" } @{$ep->value("subjects")}; },
    },
    'doc-type' => {
        spec   => sub { return { filters => [{ meta_fields => [ "type" ], value => shift }] } },
        eprint => sub { my $ep = shift; return "doc-type:".$ep->value("type") },
    },
);

my %sets;

while ( defined($_ = <DATA>) ) {
    chomp; ($_) = split /\s*#/; next if !$_;   # strip comments, skip empty lines
    my ($setSpec, $type, $value, $name) = m{ \A (([\w-]+):(\S+)) \s (.+) \z }xms
        or die "Odd format in line $.: $_";
    my $build = $fields{$type}{spec}
        or die "Unknown set prefix '$type' in line $.";
    my $spec = $sets{$setSpec} = $build->($value);
    $spec->{spec} = $setSpec;
    $spec->{name} = $name;
}
close DATA;

sub cfg_setspecs { values %sets }

sub sets_for_eprint {
    my ($eprint) = @_;
    return map { $_->{eprint}->($eprint) } values %fields;
}

1;

__DATA__
ddc:000 Generalities, Science
ddc:004 Data processing Computer science
ddc:010 Bibliography
...
doc-type:preprint Preprint
doc-type:workingPaper WorkingPaper
doc-type:article Article
doc-type:PeriodicalPart PeriodicalPart
doc-type:Periodical Periodical
doc-type:book Book
...
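
To make the effect of the hack easier to see, here is a minimal standalone sketch of the same idea. It needs no EPrints installation. The names parse_set_line and StubEPrint are hypothetical and exist only for this illustration. StubEPrint assumes nothing about a real eprint object beyond a value() method, which is all the module above calls.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Per-prefix builders, mirroring the %fields table from the module above.
my %fields = (
    'ddc' => {
        spec   => sub { return { filters => [ { meta_fields => ['subjects'], value => shift } ] } },
        eprint => sub { my $ep = shift; return map { "ddc:$_" } @{ $ep->value('subjects') } },
    },
    'doc-type' => {
        spec   => sub { return { filters => [ { meta_fields => ['type'], value => shift } ] } },
        eprint => sub { my $ep = shift; return 'doc-type:' . $ep->value('type') },
    },
);

# Turn one "prefix:value Human-readable name" line into a custom-set hash
# (hypothetical helper; the module above does this inline in its DATA loop).
sub parse_set_line {
    my ($line) = @_;
    my ( $set_spec, $type, $value, $name ) =
        $line =~ m{ \A (([\w-]+):(\S+)) \s (.+) \z }xms
        or die "Odd format in line: $line";
    my $builder = $fields{$type} or die "Unknown set prefix '$type'";
    my $set = $builder->{spec}->($value);
    @{$set}{qw(spec name)} = ( $set_spec, $name );
    return $set;
}

# Hypothetical stand-in for an eprint object; only value() is provided.
package StubEPrint {
    sub new   { my ( $class, %data ) = @_; return bless {%data}, $class }
    sub value { my ( $self, $field ) = @_; return $self->{$field} }
}

my $set = parse_set_line('ddc:004 Data processing Computer science');
my $ep  = StubEPrint->new( subjects => [ '004', '010' ], type => 'article' );

# All setSpecs the stub eprint belongs to, sorted for stable output.
my @set_specs = sort map { $_->{eprint}->($ep) } values %fields;

print "$set->{spec} => $set->{name}\n";
print join( ', ', @set_specs ), "\n";
```

Run as a script, this prints "ddc:004 => Data processing Computer science" followed by "ddc:004, ddc:010, doc-type:article". The hash returned by parse_set_line has the same shape as the entries that cfg_setspecs() hands to $oai->{custom_sets}.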


Florian

--
UB Heidelberg (Altstadt)
Plöck 107-109, 69117 HD
Abt. Informationstechnik
http://www.ub.uni-heidelberg.de/