
[EP-tech] EPrints/Elements Merge Problems



Oh I see it. I was close(ish), I end up in the "sub remove" in
FileManager.pm.

That is more than enough to go on.

Thank you again!

James

On Fri, May 17, 2019 at 11:58 AM John Salter <J.Salter at leeds.ac.uk> wrote:

> Possibly/probably?
>
> There is a configuration option in one of the symplectic_XX_something.pl
> files that governs file deletion behaviour.
>
> I'd find that - and then search the code for references to its value.
>
>
>
> Cheers,
>
> John
>
>
>
> *From:* James Kerwin [mailto:jkerwin2101 at gmail.com]
> *Sent:* 17 May 2019 11:52
> *To:* John Salter <J.Salter at leeds.ac.uk>
> *Cc:* eprints-tech at ecs.soton.ac.uk
> *Subject:* Re: [EP-tech] EPrints/Elements Merge Problems
>
>
>
> Thanks John!
>
>
>
> I'll make these changes today and hopefully never have this problem again.
> Good timing really because we had restrictions on merging items in Elements
> for a while and this has been lifted.
>
>
>
> Can I ask you a very quick question? Do you know if "FileManageHandler.pm"
> (Symplectic::Handlers::FileManageHandler) is a good place to start
> investigating a problem with deletion requests from Elements? There's a
> "delete_handler" sub in there that looks like a likely candidate, but I
> don't want to spend half of my day looking in the wrong place.
>
>
>
> Thanks,
>
> James
>
>
>
> On Thu, May 16, 2019 at 12:48 PM John Salter <J.Salter at leeds.ac.uk> wrote:
>
> Hi James,
>
> I've put some notes/code here:
>
> https://gist.github.com/jesusbagpuss/ee27acd24a5d0e3fa3d29ef0075d921b
>
> let me know if it doesn't make sense.
>
>
>
> From my comments in the code (might be useful knowledge for others):
>
> EPrints' default behaviour is to remove the 'pos' during a document clone
> *only* when the doc is being cloned to the same parent.
>
>
>
> Cheers,
>
> John
>
>
>
> *From:* eprints-tech-bounces at ecs.soton.ac.uk [mailto:
> eprints-tech-bounces at ecs.soton.ac.uk] *On Behalf Of *James Kerwin via
> Eprints-tech
> *Sent:* 16 May 2019 12:22
> *To:* John Salter <J.Salter at leeds.ac.uk>
> *Cc:* eprints-tech at ecs.soton.ac.uk
> *Subject:* Re: [EP-tech] EPrints/Elements Merge Problems
>
>
>
> Hi David and John,
>
>
>
> I've taken a look at the symplectic_merge and symplectic_pids tables, and
> using my budding skills of divination I couldn't find anything out of sorts
> in there, so far as I can tell. Although, in trying to fix this with
> multiple re-deposits, this one record has ended up with a load of different
> EPrints IDs.
>
>
>
> It appears to be as John said. I've managed to clean the record up by
> fiddling with the document and file tables and moving some files around.
> Everything appears to be working...
>
>
>
> John, thank you for the piece of SQL. It appears this isn't a huge
> problem, but there are other instances that I'm now at least aware of. I'll
> put a ticket in with Symplectic. If you do have a solution it would be
> brilliant, but out of principle I think Symplectic should provide the
> solution to their customers. Capitalists shouldn't be depending on an open
> source group to fix their code. Anyway, I don't want to get too political...
>
>
>
> Thank you both for your help.
>
>
>
> Thanks,
>
> James
>
>
>
> On Thu, May 16, 2019 at 9:49 AM John Salter <J.Salter at leeds.ac.uk> wrote:
>
> Hi James,
> Yes - and I've submitted a bug report to Symplectic - with a fix for their
> connector - which they've never rolled out *sigh*.
>
> When two items are merged, and initially they both have documents in
> folder '01', all these then get put into the 'surviving' EPrint's '01'
> directory.
> Worse still, if both items have a file of the same name e.g. Doc1.pdf (but
> they are different files), one of them will overwrite the other - and you
> have data-loss :o|
>
> You can see how many things are affected by this with the following query:
> SELECT eprintid, pos, count(*) AS c
> FROM document
> GROUP BY eprintid, pos
> HAVING c > 1;
>
> The issue is in Symplectic/RepoProcess/MergeManager.pm - and a call to:
>         my $new_doc = $doc->clone($target);
> This clone doesn't reset the 'pos' - so you get the results reported.
>
> I'll put a gist together with the changes needed to resolve this.
>
> Feel free to log it as a ticket with Symplectic...
>
> Cheers,
> John
>
> -----Original Message-----
> From: eprints-tech-bounces at ecs.soton.ac.uk [mailto:
> eprints-tech-bounces at ecs.soton.ac.uk] On Behalf Of Newman D.R. via
> Eprints-tech
> Sent: 16 May 2019 09:28
> To: eprints-tech at ecs.soton.ac.uk; James Kerwin <jkerwin2101 at gmail.com>
> Subject: Re: [EP-tech] EPrints/Elements Merge Problems
>
> Hi James,
>
> Based on some experience with this it can be due to the symplectic_pids
> table in EPrints getting out of sync.  I have in the past had to make
> manual corrections to fix this.  It has never been clear to me what
> caused the issue, as I only have access to the EPrints side.
>
> I would take a look in this EPrints database table.  It has three IDs:
> two from Symplectic and one from EPrints.  A record with no outstanding
> merge issues should have the same two Symplectic IDs and the associated
> EPrint ID.  Sometimes I have not even been able to find the record I
> need in this table based on a lookup against either the EPrint or
> Symplectic ID.  Tell me what you find and I may be able to advise
> further or confirm whether this is or is not the issue you are
> experiencing.
>
> Regards
>
> David Newman
>
> On Thu, 2019-05-16 at 09:18 +0100, James Kerwin via Eprints-tech wrote:
> > Hi All,
> >
> > This may be a question for the Symplectic list, but on the off-chance
> > anybody has experienced similar problems... Has anybody had trouble
> > with merging records in Elements and the result in EPrints being a
> > complete mess?
> >
> > A record was merged recently and the results in EPrints are two
> > documents in the same folder on the server (where the file download
> > link points to).
> >
> > For example, there is usually one file per bottom level directory:
> >
> > Dir 01 = file1.pdf
> > Dir 02 = file2.pdf
> >
> > and so on.
> >
> > I'm getting:
> >
> > Dir 01 = file1.pdf, file2.pdf
> > Dir 02 = file3.pdf
> >
> > Also, the files showing in Elements seem to be duplicating themselves,
> > and the record keeps changing to "deposit incomplete". There is always
> > one more copy of the duplicated file in Elements than appears in EPrints.
> >
> > I know I can tidy this up in EPrints, but I'd like to stop it
> > happening altogether.
> >
> > Thanks,
> > James
> > *** Options: http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
> > *** Archive: http://www.eprints.org/tech.php/
> > *** EPrints community wiki: http://wiki.eprints.org/
> > *** EPrints developers Forum: http://forum.eprints.org/
>
>
>
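
The 'pos' collision John describes in the quoted thread can be modelled in a few lines. What follows is only a language-neutral sketch in Python, not the connector's Perl and not the contents of John's gist. The Document class and the functions clone_keeping_pos, clone_with_new_pos and duplicate_positions are all hypothetical names made up for illustration. It assumes that a document's 'pos' decides which numbered folder its files land in, and that one possible fix is to give a cloned document the next unused 'pos' on the surviving EPrint.

```python
from collections import Counter
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Document:
    """Hypothetical stand-in for one row of the EPrints 'document' table."""
    eprintid: int
    pos: int
    filename: str


def clone_keeping_pos(doc: Document, target_eprintid: int) -> Document:
    """Model of the reported behaviour: the clone keeps the source 'pos'."""
    return replace(doc, eprintid=target_eprintid)


def clone_with_new_pos(doc: Document, target_eprintid: int,
                       existing: List[Document]) -> Document:
    """Model of one possible fix (an assumption): use the next unused 'pos'."""
    used = [d.pos for d in existing if d.eprintid == target_eprintid]
    return replace(doc, eprintid=target_eprintid, pos=max(used, default=0) + 1)


def duplicate_positions(docs: List[Document]) -> List[Tuple[int, int]]:
    """Same idea as the query: GROUP BY eprintid, pos HAVING count > 1."""
    counts = Counter((d.eprintid, d.pos) for d in docs)
    return sorted(key for key, c in counts.items() if c > 1)


# The surviving EPrint (100) and the merged-in EPrint (200) both start
# with a document at pos 1, i.e. both have files in folder '01'.
survivor = [Document(eprintid=100, pos=1, filename="file1.pdf")]
merged_in = Document(eprintid=200, pos=1, filename="file2.pdf")

broken = survivor + [clone_keeping_pos(merged_in, 100)]
fixed = survivor + [clone_with_new_pos(merged_in, 100, survivor)]

print(duplicate_positions(broken))  # [(100, 1)]
print(duplicate_positions(fixed))   # []
```

In the broken case two documents share (eprintid, pos), which is exactly what the SQL query in the thread reports. In the fixed case the clone moves to pos 2, so the two files no longer share a folder and cannot overwrite each other.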