
[EP-tech] Making a static copy of an EPrints repo



I would use:

     wget --no-parent \
          --no-check-certificate \
          --html-extension \
          --convert-links \
          --restrict-file-names=windows \
          --recursive \
          --level=inf \
          -N \
          --page-requisites \
          -e robots=off \
          --wait=0 \
          --quota=inf \
          http://my.repo/

I think --convert-links will take care of rewriting the links so that they point at the local copies, so the separate sed pass should not be needed.
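
To check whether any absolute links survived the conversion, something like the following might help. It is only a sketch: the function name find_absolute_links is made up for illustration, and it assumes the mirror was written to a directory such as my.repo and that the original site root was http://my.repo/.

```shell
# Sketch: list files in a mirror directory that still mention the original
# absolute base URL. The function name and arguments are illustrative only.
find_absolute_links() {
    mirror_dir=$1   # directory wget wrote to, e.g. my.repo
    base_url=$2     # original site root, e.g. http://my.repo/
    # -r: recurse, -l: print file names only, -F: treat the URL as a fixed string
    grep -rlF -- "$base_url" "$mirror_dir"
}

# Example (prints nothing if no file mentions the base URL):
# find_absolute_links my.repo http://my.repo/
```

Any file it lists still contains the old base URL, which may simply mean the link target was not downloaded.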


On 18/07/2017 11:04, Ian Stuart wrote:
> I need to make a read-only, static copy of an old repo (the hardware is
> dying, the installation was heavily tailored for the environment, and I
> don't have the time to re-create it in a new environment).
>
> I can grab all the active pages:
>
>     wget --local-encoding=UTF-8 --remote-encoding=UTF-8 --no-cache \
>          --mirror -nc -k http://my.repo/
>
> This is good, however it doesn't edit all the absolute URLs in the view
> pages, so we need to modify them:
>
>     find my.repo -type f -exec sed -i 's_http://my.repo/_/_g' {} +
>
> However this leaves me with the problem that the http://my.repo/nnn/
> pages haven't been pulled down!
>
> Any suggestions on how to do this?
>
> Cheers
>
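
If the http://my.repo/nnn/ pages are still missed because nothing in the crawl links to them, one possible workaround is to hand wget an explicit list of those URLs. The sketch below only generates the list. The function name eprint_urls and the ID range are assumptions; use whatever range of IDs the repository really has.

```shell
# Sketch: print one URL per numeric ID so wget can be seeded explicitly.
# The function name and the ID range are assumptions, not EPrints facts.
eprint_urls() {
    base_url=$1   # e.g. http://my.repo
    first_id=$2   # lowest ID to include
    last_id=$3    # highest ID to include
    id=$first_id
    while [ "$id" -le "$last_id" ]; do
        # ${base_url%/} drops one trailing slash so we never emit "//"
        printf '%s/%s/\n' "${base_url%/}" "$id"
        id=$((id + 1))
    done
}

# Example (hypothetical range 1..5000):
# eprint_urls http://my.repo 1 5000 > eprint-urls.txt
# wget --input-file=eprint-urls.txt --page-requisites --convert-links
```

IDs that do not exist will just produce 404s, which wget reports and skips.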