EPrints Technical Mailing List Archive

Message: #01579



[EP-tech] Re: RFC access log table


Do OAI-PMH harvests appear in the access table?


-----Original Message-----
From: eprints-tech-bounces@ecs.soton.ac.uk [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Alan.Stiles
Sent: 15 February 2013 10:30
To: 'eprints-tech@ecs.soton.ac.uk'
Subject: [EP-tech] Re: RFC access log table

Hi Tim,

Having had a quick look through the access table, it might also be nice to have an option to include or exclude a list of known robots and spiders from the CSV dumps. It could also be useful to strip them from the access table itself, independently of the dumps, to keep it at a more manageable size without losing 'relevant' information. Bing and Yandex appear to be among our worst offenders.
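
As a rough illustration of the include/exclude option described above, here is a minimal Python sketch. It is not EPrints code: the column names (accessid, datestamp, user_agent), the ROBOT_MARKERS list and both function names are hypothetical, and a real robot list would be longer and site-specific.

```python
import csv
import io

# Hypothetical robot markers; a real list would be site-specific and longer.
ROBOT_MARKERS = ("bingbot", "yandex", "googlebot")


def is_robot(user_agent, markers=ROBOT_MARKERS):
    """Return True if the user-agent string contains any known robot marker."""
    ua = (user_agent or "").lower()
    return any(marker in ua for marker in markers)


def dump_csv(rows, include_robots=False):
    """Render access rows (dicts) as CSV text, optionally dropping robot hits.

    The field names are assumptions for this sketch, not the EPrints schema.
    """
    fieldnames = ["accessid", "datestamp", "user_agent"]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        if include_robots or not is_robot(row["user_agent"]):
            writer.writerow(row)
    return out.getvalue()
```

Calling dump_csv(rows) would leave robot hits out of the dump, while dump_csv(rows, include_robots=True) keeps them. The same is_robot test could in principle drive a clean-up of the table itself.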

Alan.

-----Original Message-----
From: Tim Brody [mailto:tdb2@ecs.soton.ac.uk] 
Sent: 15 February 2013 09:32
To: eprints-tech@ecs.soton.ac.uk
Subject: [EP-tech] Re: RFC access log table

Hi,

Yes, there is nothing in the core that relies on data in access*.
IRStats 1 & 2 use the access table to create their summary data.

It looks like the best solution is to provide a tool that periodically dumps
historic access data to files, while still keeping "current" (as defined by
config) data in the database.
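
A minimal Python sketch of such a dump tool might look like the following. It is only an illustration under stated assumptions: it uses SQLite as a stand-in for the real database, the table and column names (access, accessid, datestamp, user_agent) are hypothetical rather than the actual EPrints schema, and the cutoff date stands in for whatever "current" would mean in the config.

```python
import csv
import os
import sqlite3
from collections import defaultdict


def archive_old_accesses(conn, cutoff, out_dir):
    """Move rows older than `cutoff` (ISO 'YYYY-MM-DD') into monthly CSV files.

    Returns the number of rows archived. Table and column names are
    illustrative assumptions, not the real EPrints schema.
    """
    rows = conn.execute(
        "SELECT accessid, datestamp, user_agent FROM access "
        "WHERE datestamp < ? ORDER BY datestamp",
        (cutoff,),
    ).fetchall()

    # Group rows by the 'YYYY-MM' prefix of the ISO datestamp.
    by_month = defaultdict(list)
    for row in rows:
        by_month[row[1][:7]].append(row)

    os.makedirs(out_dir, exist_ok=True)
    for month, month_rows in by_month.items():
        path = os.path.join(out_dir, f"access-{month}.csv")
        is_new = not os.path.exists(path)
        with open(path, "a", newline="") as fh:
            writer = csv.writer(fh)
            if is_new:
                writer.writerow(["accessid", "datestamp", "user_agent"])
            writer.writerows(month_rows)

    # Delete from the database only after the files have been written.
    with conn:
        conn.execute("DELETE FROM access WHERE datestamp < ?", (cutoff,))
    return len(rows)
```

Rows at or after the cutoff stay in the table, so anything that reads "current" data from the database would keep working, and the monthly files could be re-imported if the historic data ever needed reprocessing.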

All the best,
Tim.

On Fri, 15 Feb 2013 08:13:52 +0100, Yuri <yurj@alfa.it> wrote:
> We have a test server which is a clone of the production server. Can I 
> empty those access tables safely to save space? :) Can I run a "delete 
> from access" without any issue? The same for access__ordervalues_en and 
> all the other languages?
> 
> On 15/02/2013 03:13, Mark Gregson wrote:
>> Hi Tim
>>
>> Because of the DB backup issues we invested some time a while ago in some
>> scripts for archiving the access data off to monthly dumps and for
>> restoring it (if required, say, by the need to have IRStats reprocess all
>> data). These scripts are not actually in production use because I haven't
>> had time to test them to my satisfaction (sorry Nick!).
>>
>> CSV is a more accessible format than a MySQL dump, which may be a
>> benefit.
>>
>> We are using IRStats for statistics which uses the access table but I
>> guess this will be easily updated with a new parser. We also do some
>> custom logging to the access table for reporting on outbound link clicks
>> via IRStats.  This logging is handled via EPrints::Apache::LogHandler.
>>
>> Cheers
>> Mark
>>
>>
>> -----Original Message-----
>> From: eprints-tech-bounces@ecs.soton.ac.uk
>> [mailto:eprints-tech-bounces@ecs.soton.ac.uk] On Behalf Of Tim Brody
>> Sent: Thursday, 14 February 2013 8:01 PM
>> To: eprints-tech@ecs.soton.ac.uk
>> Subject: [EP-tech] RFC access log table
>>
>> Hi All,
>>
>> I'm thinking about the access log table and how it can be made
>> sustainable.
>>
>> What I'm suggesting is to write accesses to CSV-formatted log files, one
>> file per month. What I don't know is whether anyone is relying on the
>> database table for generating statistics?
>>
>> The problem the access log table creates is in backing-up the EPrints
>> database.
>>
>> I'd appreciate any thoughts/comments.
>>
>> --
>> All the best,
>> Tim
>>
>> *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
>> *** Archive: http://www.eprints.org/tech.php/
>> *** EPrints community wiki: http://wiki.eprints.org/
> 

-- 
All the best,
Tim.

-- 
The Open University is incorporated by Royal Charter (RC 000391), an exempt charity in England & Wales and a charity registered in Scotland (SC 038302).

