EPrints Technical Mailing List Archive

Message: #01318

< Previous (by date) | Next (by date) > | < Previous (in thread) | Next (in thread) > | Messages - Most Recent First | Threads - Most Recent First

[EP-tech] Exporting a .zip file

EPrints 3.2.2 on Solaris

I'm writing an exporter to download the item as a .zip file (as in, a file that can be used in a SWORD deposit.)

I can create the .zip file; I can send something called foo.zip back to the user.... but the thing I get back is not readable as a .zip file!

Here is the code (with most of the integrity-checking & sanity-checking taken out)

---------------- code -----------------

package EPrints::Plugin::Export::SWORD_Deposit_File;

use strict;
use warnings;

use Archive::Zip qw( :ERROR_CODES :CONSTANTS);

use EPrints::Plugin::Export::METS_Broker;
our @ISA = qw( EPrints::Plugin::Export::METS_Broker );

our $PREFIX = '';

sub new
    my ($class, %opts) = @_;

    my $self = $class->SUPER::new(%opts);

    $self->{name}    = "foo";
    $self->{accept}  = ['dataobj/eprint', 'list/eprint'];
    $self->{visible} = "all";

    $self->{xmlns}          = "foo";
    $self->{schemaLocation} = "foo2";
    $self->{suffix}         = ".zip";
    $self->{mimetype}       = 'application/zip';

    return $self;
} ## end sub new

sub output_dataobj
    my ($plugin, $dataobj) = @_;

    my $text;
    $text = <<EOX;
<?xml version="1.0" encoding="utf-8" ?>


    my $xml = $plugin->xml_dataobj($dataobj);
    $text .= EPrints::XML::to_string($xml);

    # Use Archive::Zip to create the zip file
    my $archive = Archive::Zip->new();

    # Add the manifest as mets.xml

    # for some reason, I can't get Archive::Zip to add a file with
    # unicode characters, so this is a nasty hack, sorry
    my $tmpdir = EPrints::TempDir->new(UNLINK => 0);
    my $tmpfile = "${tmpdir}/mets.xml";
warn "tempfile: $tmpfile\n";

    # write out the xml file
    my $FH;
    open($FH, ">:utf8", "$tmpfile");
    print $FH $text;
    $archive->addFile($tmpfile, "mets.xml");

    # end of fudge

    # Each 'Document' is represented by a directory
    # Each 'Document' many have a number of files
    foreach my $doc ($dataobj->get_all_documents)

        # In the zip file, we use file paths rather than URLs, however
        # this routine is very similar to the methodology in
        # EPrints::Plugin::Export::METS_Broker::_make_fileSec

        # the docpath is the path to the document on the disk,
        # formatted:
        # /<EprintsRoot>/archive/<archiveID>/documents/<somepath>/<doc_id>
        my $docroot = $doc->local_path;
        $docroot =~ /\/(\d+)$/;
        my $doc_id = $doc->get_id;

        $archive->addTree($docroot, $doc_id);
     } ## end foreach my $doc ($dataobj->get_all_documents...)

my @M = $archive->memberNames();
warn "Archive: @M\n";
    # as of Perl 5.8.0, we can open a filehandle to variables
    my $buffer = "";
open(my $fh, ">:bytes", \$buffer) || warn "Can't open a buffer in memory\n";

    unless ($archive->writeToFileHandle($fh) == AZ_OK)
        warn "failed to write archive into scalar\n";
    close $fh;
warn "Written archive into scalar\n";
warn "size:". length($buffer)."\n";

   return $buffer;
} ## end sub output_dataobj

---------------- code -----------------

As you can see, I've got some debugging going on

tempfile: /var/tmp/n2grhkaewB/mets.xml
Archive: mets.xml 474/ 474/2012-09-06767_121010093659.zip
Written archive into scalar

and looking at the .zip file:
[broker@devel:Export]: unzip -l /var/tmp/n2grhkaewB/deposit.zip
Archive:  /var/tmp/n2grhkaewB/deposit.zip
  Length     Date   Time    Name
 --------    ----   ----    ----
     4206  11-20-12 09:29   mets.xml
        0  10-10-12 14:37   474/
     1178  10-10-12 14:37   474/2012-09-06767_121010093659.zip
 --------                   -------
     5384                   3 files

So.... I can obviously MAKE the file.... I just can't send it back to the user sensibly.

Anyone done this?
If not - anyone any suggestions as to why it's not working?

Oh - and a SWORD transfer exporter is fine: it's essentially this code, but then uses LWP::UserAgent to do a post:

  my $ua      = LWP::UserAgent->new();
  my $rxml    = $plugin->{session}->xml;
  my $returns = $rxml->create_element("returns");
  my $req = HTTP::Request->new(POST => $target->{uri});

  # Tell SWORD to process the contents of the zip file as my format
           'X-Packaging'         => 'http://opendepot.org/broker/1.0',
           'X-No-Op'             => 'false',
           'X-Verbose'           => 'false',
           'Content-Disposition' => 'filename=sword_deposit.zip',
           'filename'            => 'sword_deposit.zip',
  my $res = $ua->request($req);
  if ($res->is_success)
     my $content_dom = $rxml->parse_string($res->content);
     my $fragment = $content_dom->getElementsByTagName('entry')->item(0);
  } ## end if ($res->is_success)


Ian Stuart.
Developer: ORI, RJ-Broker, and OpenDepot.org
Bibliographics and Multimedia Service Delivery team,
The University of Edinburgh.


This email was sent via the University of Edinburgh.

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.