EPrints Technical Mailing List Archive

Message: #03392



[EP-tech] Reply: Re: About IRStats2


Hi,

It looks like irstats2 has a flaw in how it gathers author data. The statistics on our test server zoratest (irstats2 is not yet deployed on the production server www.zora.uzh.ch) give the following picture:

Most Downloaded Items:

count,eprintid,description
="10077",="10147","<a href=''>Body Modification: psychologische Aspekte von Piercings und anderen Körperveränderungen</a>"
="5958",="2532","<a href=''>Welpenfütterung in der Schweiz</a>"
="5204",="43064","<a href=''>Extraartikuläre weichteilrheumatische Erkrankungen (Weichteilrheumatismus) und Rückenschmerzen</a>"
="4956",="24050","<a href=''>Traumatic pericarditis in cattle: clinical, radiographic and ultrasonographic findings</a>"
="4539",="19506","<a href=''>IFRS aktuell: Neues aus wichtigen Gremien rund um die internationale Rechnungslegung</a>"

Top Authors:

count,set_value,description
="9710",90de69aa75e88bae17a48fe111738757,"Zweifel, Peter"
="9170",a8919638b6af7edf5e6201132093647d,"Schwabe, Gerhard"
="8944",8d05f2d4d6fa6596e6b676723b5082c8,"Fehr, Ernst"
="8381",f3f5e1a2127f50a31435b5fb54d2bef9,"Deplazes, P"
="8289",6f95d542bd0c4b9760811f454004c76c,"Linden, A"


You can see immediately that this is plainly wrong: the author "Kälin, R", who published eprintid 10147 (see http://www.zora.uzh.ch/10147/), isn't on the list of top authors, but should be at the top of it with a count of 10077 downloads.

Kälin, R also doesn't appear in the Filter Items list of irstats2.

Checking the SQL tables as Seb suggested yields:

mysql> select * from eprint_creators_name where eprintid=10147\G
*************************** 1. row ***************************
eprintid: 10147
pos: 0
creators_name_honourific:
creators_name_given: R
creators_name_family: K?lin
creators_name_lineage:
1 row in set (0.00 sec)


mysql> select * from eprint_creators_id where eprintid=10147\G
Empty set (0.00 sec)


Another eprint indeed lists entries in the eprint_creators_id table:

mysql> select * from eprint_creators_id where eprintid=13208;
+----------+-----+------------------------------+
| eprintid | pos | creators_id                  |
+----------+-----+------------------------------+
|    13208 |   0 | mjackson@vetclinics.uzh.ch   |
|    13208 |   1 |                              |
|    13208 |   2 | jkuemmerle@vetclinics.uzh.ch |
|    13208 |   3 | afuerst@vetclinics.uzh.ch    |
+----------+-----+------------------------------+


Conclusion: irstats2 seems to gather author statistics correctly only if there is a creators_id entry (at least an e-mail address) in the table eprint_creators_id. It also seems to produce a filter list entry only if there is a corresponding entry in eprint_creators_id.
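
To get a rough idea of how many records are affected, one could list every creator that has a name but no (or an empty) creators_id at the same position. The following is only a sketch under stated assumptions: it uses Python with an in-memory SQLite database instead of our MySQL instance, the two tables are reduced to the columns shown above, and the names for eprint 13208 are placeholders, since only its creators_id rows were queried. The SELECT statement is plain SQL and should carry over to MySQL.

```python
import sqlite3

# Reduced stand-in for the two EPrints tables discussed above.
# Assumption: only the columns relevant to the check are modelled.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE eprint_creators_name (
        eprintid INTEGER, pos INTEGER,
        creators_name_family TEXT, creators_name_given TEXT
    );
    CREATE TABLE eprint_creators_id (
        eprintid INTEGER, pos INTEGER, creators_id TEXT
    );
""")

conn.executemany(
    "INSERT INTO eprint_creators_name VALUES (?, ?, ?, ?)",
    [
        (10147, 0, "Kälin", "R"),
        # Placeholder names: the real names of eprint 13208 are not shown above.
        (13208, 0, "Placeholder0", "A"),
        (13208, 1, "Placeholder1", "B"),
        (13208, 2, "Placeholder2", "C"),
        (13208, 3, "Placeholder3", "D"),
    ],
)
conn.executemany(
    "INSERT INTO eprint_creators_id VALUES (?, ?, ?)",
    [
        (13208, 0, "mjackson@vetclinics.uzh.ch"),
        (13208, 1, ""),  # empty id, as in the table output above
        (13208, 2, "jkuemmerle@vetclinics.uzh.ch"),
        (13208, 3, "afuerst@vetclinics.uzh.ch"),
    ],
)

# Creators with a name row but a missing or empty creators_id at the same pos.
MISSING_ID_SQL = """
    SELECT n.eprintid, n.pos, n.creators_name_family, n.creators_name_given
    FROM eprint_creators_name AS n
    LEFT JOIN eprint_creators_id AS i
           ON i.eprintid = n.eprintid AND i.pos = n.pos
    WHERE i.creators_id IS NULL OR i.creators_id = ''
    ORDER BY n.eprintid, n.pos
"""


def creators_without_id(connection):
    """Return (eprintid, pos, family, given) rows lacking a usable creators_id."""
    return connection.execute(MISSING_ID_SQL).fetchall()


missing = creators_without_id(conn)
for row in missing:
    print(row)
```

On the real database only the SELECT would be needed. If the conclusion above is right, the rows it returns are the creators that are missing from the Top Authors report and from the filter list.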

Irstats2 authors, please correct this wrong behavior.

Best regards,

Martin

--
Dr. Martin Brändle
Informatikdienste
Universität Zürich
Winterthurerstr. 190
CH-8057 Zürich

mail: martin.braendle@id.uzh.ch
phone: +41 44 63 56705
fax: +41 44 63 54505
http://www.id.uzh.ch


From: Sebastien Francois <sf2@ecs.soton.ac.uk>
To: eprints-tech@ecs.soton.ac.uk
Date: 25/07/2014 15:49
Subject: [EP-tech] Re: About IRStats2
Sent by: eprints-tech-bounces@ecs.soton.ac.uk





And you do have creators data?  (select * from eprint_creators_name ---and/or--- select * from eprint_creators_id)

Cos that's where irstats2 tries to process the data from.

Seb.

On 25/07/14 13:07, pgasinos pgs wrote:
    Yes  
    My repository is:
    http://anaktisis.teiwm.gr 

    Kostas Pgasinos

    On Friday, 25 July 2014, Sebastien Francois <sf2@ecs.soton.ac.uk> wrote:
      Hey,

      Do you have a URL I can look at?

      It seems like there are some issues with your data (the "countries' does not exist" error indicates some issues with Geo::IP). Do you get any related errors/warnings when you run "bin/epadmin test"?

      If that were possible, I'd re-generate all the stats:

      bin/stats/process_stats <id> --uninstall

      then

      bin/stats/process_stats <id> --setup --verbose

      As you know, this may take some time (depending on the size of your 'access' dataset).

      Kind regards,
      Seb
*** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
*** Archive:
http://www.eprints.org/tech.php/
*** EPrints community wiki:
http://wiki.eprints.org/
*** EPrints developers Forum:
http://forum.eprints.org/