
[EP-tech] Re: RFC access log table



On 15/02/2013 10:59, Tim Brody wrote:
> IRStats builds summary tables, so it doesn't need the data once the
> processing has been run.
>
> But ... unless you are really tight on space

Can bigger tables be a problem, or is it nothing to worry about?

>   I would always keep the
> original data. If you have Apache logs you could reverse-engineer them into
> the access-log equivalent (URL matching).

I would keep them as long as possible :)
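
To make the "URL matching" idea concrete, here is a minimal sketch in Python of turning one Apache log line into an access-style record. The URL layout (/<eprintid>/ for an abstract page, /<eprintid>/<docpos>/<filename> for a full-text download), the record keys, and the function name parse_access are assumptions for illustration, not the schema EPrints or IRStats actually use.

```python
import re
from datetime import datetime

# Prefix of the Apache common/combined log format:
# host ident user [time] "request" status size
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)

# Assumed URL layout (hypothetical, adjust to the repository's real URLs):
#   /<eprintid>/                     abstract page
#   /<eprintid>/<docpos>/<filename>  full-text download
EPRINT_PATH = re.compile(
    r'^/(?P<eprintid>\d+)/(?:(?P<docpos>\d+)/(?P<filename>[^?]+))?(?:\?.*)?$'
)

def parse_access(line):
    """Return an access-style dict for one log line, or None if it is not a hit."""
    m = LOG_LINE.match(line)
    # Only successful GET requests are counted in this sketch.
    if m is None or m.group("method") != "GET" or m.group("status") != "200":
        return None
    p = EPRINT_PATH.match(m.group("path"))
    if p is None:
        return None
    when = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return {
        "datestamp": when,
        "requester": m.group("host"),
        "eprintid": int(p.group("eprintid")),
        "kind": "fulltext" if p.group("docpos") else "abstract",
    }
```

Requests that do not match the assumed layout (browse views, static files, error responses) return None and would simply be skipped. Robot filtering and partial (206) downloads are deliberately left out.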

>
> /Tim.
>
> On Fri, 15 Feb 2013 10:59:15 +0100, Yuri <yurj at alfa.it> wrote:
>> Great! 5,9 GB freed :-)
>>
>> Does IRStats use them once, or does it always need them?
>>
>> On 15/02/2013 10:32, Tim Brody wrote:
>>> Hi,
>>>
>>> Yes, there is nothing in the core that relies on data in access*. The
>>> IRStats 1 & 2 use access to create their summary data.
>>>
>>> It looks like the best solution is to provide a tool to periodically dump
>>> historic access data to files, but that it is still useful to keep
>>> "current" (defined by config) data in the database.
>>>
>>> All the best,
>>> Tim.
>>>
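
As a rough illustration of the "dump historic data, keep current data" tool described above, the sketch below moves rows older than a cutoff date into one CSV file per month and then deletes them from the table. It uses Python's built-in sqlite3 as a stand-in for the real MySQL database. The single-column ISO datestamp, the column list, and the function name archive_old_accesses are simplifying assumptions, not the actual EPrints schema or an existing EPrints script.

```python
import csv
import sqlite3
from pathlib import Path

# Hypothetical, simplified column set for the access table.
COLUMNS = ["accessid", "datestamp", "requester_id", "referent_id", "service_type_id"]

def archive_old_accesses(conn, cutoff, out_dir):
    """Move rows with datestamp < cutoff (ISO 'YYYY-MM-DD') into monthly CSV files.

    Returns the number of rows archived.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    rows = conn.execute(
        f"SELECT {', '.join(COLUMNS)} FROM access "
        "WHERE datestamp < ? ORDER BY datestamp, accessid",
        (cutoff,),
    ).fetchall()

    # Group rows by the 'YYYY-MM' prefix of the datestamp.
    by_month = {}
    for row in rows:
        by_month.setdefault(row[1][:7], []).append(row)

    for month, month_rows in by_month.items():
        path = out_dir / f"access-{month}.csv"
        is_new = not path.exists()
        with path.open("a", newline="") as fh:
            writer = csv.writer(fh)
            if is_new:
                writer.writerow(COLUMNS)  # header only for a fresh file
            writer.writerows(month_rows)

    # Delete only after every file has been written and closed.
    with conn:
        conn.execute("DELETE FROM access WHERE datestamp < ?", (cutoff,))
    return len(rows)
```

The cutoff plays the role of the "current (defined by config)" window: anything at or after it stays in the database. Because files are opened in append mode, re-running the tool with a later cutoff adds to existing monthly files rather than overwriting them.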
>>> On Fri, 15 Feb 2013 08:13:52 +0100, Yuri <yurj at alfa.it> wrote:
>>>> We have a test server which is a clone of the production server. Can I
>>>> safely empty those access tables to save space? :) Can I do a "delete
>>>> from access" without any issue? The same for access__ordervalues_en and
>>>> all the languages?
>>>>
>>>> On 15/02/2013 03:13, Mark Gregson wrote:
>>>>> Hi Tim
>>>>>
>>>>> Because of the DB backup issues we invested some time a while ago in some
>>>>> scripts for archiving the access data off to monthly dumps and for
>>>>> restoring it (if required, say by the need to have IRStats reprocess all
>>>>> data). These scripts are not actually in production use because I haven't
>>>>> had time to test them to my satisfaction (sorry Nick!).
>>>>>
>>>>> CSV is a more accessible format than a MySQL dump, which may be a
>>>>> benefit.
>>>>>
>>>>> We are using IRStats for statistics, which uses the access table, but I
>>>>> guess this will be easily updated with a new parser. We also do some
>>>>> custom logging to the access table for reporting on outbound link clicks
>>>>> via IRStats.  This logging is handled via EPrints::Apache::LogHandler.
>>>>>
>>>>> Cheers
>>>>> Mark
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: eprints-tech-bounces at ecs.soton.ac.uk
>>>>> [mailto:eprints-tech-bounces at ecs.soton.ac.uk] On Behalf Of Tim Brody
>>>>> Sent: Thursday, 14 February 2013 8:01 PM
>>>>> To: eprints-tech at ecs.soton.ac.uk
>>>>> Subject: [EP-tech] RFC access log table
>>>>>
>>>>> Hi All,
>>>>>
>>>>> I'm thinking about the access log table and how it can be made
>>>>> sustainable.
>>>>>
>>>>> What I'm suggesting is to write accesses to CSV-formatted log files, one
>>>>> file per month. What I don't know is whether anyone is relying on the
>>>>> database table for generating statistics?
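
The one-CSV-file-per-month proposal could look roughly like the Python sketch below. The field names, the access-YYYY-MM.csv naming scheme, and the function name log_access are illustrative assumptions, not a format that EPrints defines.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical field set for one logged access.
FIELDS = ["datestamp", "requester_id", "referent_id", "service_type_id"]

def log_access(log_dir, record, now=None):
    """Append one access record to the CSV file for the current month.

    Returns the path of the file that was written to.
    """
    now = now or datetime.now(timezone.utc)
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    path = log_dir / f"access-{now:%Y-%m}.csv"
    write_header = not path.exists()
    with path.open("a", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()  # first record of the month
        writer.writerow({"datestamp": now.isoformat(timespec="seconds"), **record})
    return path
```

Rolling over to a new file happens implicitly when the month changes, so old months can be compressed or moved out of the backup set without touching the database. This sketch does no file locking, so a real logger running in several web server processes would need to guard concurrent appends.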
>>>>>
>>>>> The problem the access log table creates is in backing-up the EPrints
>>>>> database.
>>>>>
>>>>> I'd appreciate any thoughts/comments.
>>>>>
>>>>> --
>>>>> All the best,
>>>>> Tim
>>>>>
>>>>> *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
>>>>> *** Archive: http://www.eprints.org/tech.php/
>>>>> *** EPrints community wiki: http://wiki.eprints.org/