
[EP-tech] Eprints Log Analysis

I am trying to make sense of an Apache log file from an EPrints instance
and was wondering if there is an easier way of identifying the following
items by analysing the 'request line' of each log entry:

1. item deposits
2. item abstract or full metadata edits/access
3. OAI-PMH/SWORD interaction from external sources

As an initial step, I've worked out that for items 1 and 2 I would have
to isolate the entries that result from authenticated sessions (basically
checking that field 4 of the log entry is NOT a hyphen).
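
A minimal sketch of that check, assuming each entry sits on one line and
its fields are separated by whitespace, so that field 4 is the user name.
The function names are only illustrative:

```python
def is_authenticated(line):
    """Return True when field 4 (the remote user) is present and not a hyphen."""
    fields = line.split()
    # Field 4 is index 3; lines with fewer than four fields are treated
    # as unauthenticated rather than raising an error.
    return len(fields) >= 4 and fields[3] != "-"


def authenticated_entries(lines):
    """Yield only the log lines that come from authenticated sessions."""
    for line in lines:
        if is_authenticated(line):
            yield line
```

Passing an open log file to authenticated_entries() should yield just the
lines worth inspecting for items 1 and 2.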

A sample entry from the log file of the EPrints instance I am analysing is below.

***** Sample Log Entry *****
[IP] [IP] - userid [29/Jan/2010:01:13:18 +0200] "GET
/perl/users/record HTTP/1.1" 200 5531
"http://pubs.cs.uct.ac.za/perl/users/home" "Mozilla/5.0 (Windows; U;
Windows NT 6.1; en-GB; rv: Gecko/20091221 Firefox/3.5.7 (.NET
CLR 3.5.30729)"
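
To get at the request line itself, an entry like the one above could be
split into named fields with a regular expression. This is only a sketch:
the pattern assumes the layout of the sample (two address fields, ident,
user, timestamp, request line, status, size) with the entry on a single
line, and the group names are my own:

```python
import re

# Assumed layout, taken from the sample entry: two address fields, ident,
# user, [timestamp], "request line", status and response size. The referer
# and user agent that follow are ignored here.
LOG_PATTERN = re.compile(
    r'^(?P<addr1>\S+) (?P<addr2>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def parse_entry(line):
    """Return a dict of the entry's fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    return entry
```

The 'method' and 'path' fields are then the parts of the request line that
would need to be matched against whichever URLs correspond to deposits,
edits and OAI-PMH/SWORD requests.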