EPrints Technical Mailing List Archive

Message: #04102



[EP-tech] Manual replication, LAN -> DMZ


Currently we run EPrints 3.2.2 on Ubuntu 12.04 LTS in the DMZ. All changes are made directly there (only by our employees), but all of the documents are world-accessible.

 

I’d like to split it into two components: a full EPrints repository running inside the LAN, where our authorized employees would continue to make edits, and a public server in the DMZ that receives nightly batch updates. Ideally the end result would allow a complete, ground-up rebuild of the DMZ version if it were ever hacked or damaged.

 

I plan on using rsync for the filesystem, preserving all timestamps & permissions, and don’t foresee any problems there.
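As a rough illustration of the rsync step described above, here is a minimal Python sketch that only builds the command line. The paths and the host name "dmz-host" are hypothetical placeholders, and the use of --delete (so the DMZ copy mirrors the LAN copy exactly) is an assumption rather than something decided in this message. The --archive option is what preserves timestamps and permissions.

```python
import shlex


def build_rsync_command(src, dest, dry_run=True):
    """Return the rsync argument list for mirroring src onto dest.

    --archive implies recursion and preserves timestamps, permissions,
    ownership and symlinks. --delete is an assumption: it removes files
    on the destination that no longer exist on the source.
    """
    cmd = ["rsync", "--archive", "--delete"]
    if dry_run:
        # Report what would change without touching the destination.
        cmd.append("--dry-run")
    # A trailing slash on the source copies the directory's contents
    # rather than the directory itself.
    cmd += [src.rstrip("/") + "/", dest]
    return cmd


if __name__ == "__main__":
    # Hypothetical paths and host, for illustration only.
    cmd = build_rsync_command(
        "/opt/eprints3/archives",
        "dmz-host:/opt/eprints3/archives/",
    )
    print(shlex.join(cmd))
```

Running it with dry_run=True first gives a chance to review what would be transferred or deleted before the nightly job is allowed to change anything.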

 

For the database, I don’t want to use mysqldbcompare, because it syncs the entire database, and I want a crippled set of “user*” tables on the DMZ machine to thwart logging in.

Instead, Percona pt-table-sync looks like the right tool, except that it chokes on “eprint__rindex”, which is not well suited to the algorithms pt-table-sync uses. With that table excluded from the synchronization, everything else completes in a few seconds when there are no changes to apply.
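The exclusion logic above could be sketched roughly as follows. This is only an illustration under stated assumptions: the DSN strings, the database name "eprints", and the sample table list are hypothetical, and the default pattern list covers just the “user*” tables and “eprint__rindex” mentioned so far. The sketch expands the wildcard patterns against a list of table names and passes the result to pt-table-sync through its --ignore-tables option. It defaults to --print so that the statements can be reviewed before anything is run with --execute.

```python
from fnmatch import fnmatchcase

# Patterns taken from this message; extend the list as needed.
DEFAULT_EXCLUDE_PATTERNS = ["user*", "eprint__rindex"]


def tables_to_ignore(all_tables, patterns=DEFAULT_EXCLUDE_PATTERNS):
    """Return the sorted table names that match any exclusion pattern."""
    return sorted(
        t for t in all_tables
        if any(fnmatchcase(t, p) for p in patterns)
    )


def build_sync_command(source_dsn, dest_dsn, database, ignored, execute=False):
    """Build a pt-table-sync argument list (source DSN first, then destination)."""
    # --print only shows the statements; --execute applies them.
    cmd = ["pt-table-sync", "--execute" if execute else "--print"]
    cmd += ["--databases", database]
    if ignored:
        cmd += ["--ignore-tables", ",".join(ignored)]
    cmd += [source_dsn, dest_dsn]
    return cmd


if __name__ == "__main__":
    # Hypothetical table list; in practice it would come from SHOW TABLES.
    tables = ["eprint", "eprint__rindex", "document", "user", "user__ordervalues_en"]
    ignored = tables_to_ignore(tables)
    # Hypothetical DSNs and database name.
    print(" ".join(build_sync_command("h=lan-host", "h=dmz-host", "eprints", ignored)))
```

Expanding the patterns in Python keeps the exclusion list explicit and easy to log each night, which helps when checking that no “user*” table was synchronized by accident.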

 

Q1: If I run the indexer service on the DMZ box, will that correctly build “eprint__rindex”? Does the indexer “know” when files exist which need to be indexed, or must it be “told” by the eprints program to index new items?

 

Q2: I assume I don’t want to copy “access” & “access__ordervalues_en”, or the “cache*” tables. What other tables do I not need to copy, or would be detrimental to copy?

 

Thanks!

Dan Stieneke

IT Specialist

USDA - ARS - NWISRL

3793 N 3600 E

Kimberly, ID 83341

208/423-6519