
[EP-tech] Re: Indexing files inside a compressed (zip, rar) document


Hi Andras, 

If you're on 3.3, have a look inside lib/cfg.d/search_xapian.pl and find
where it does the full-text indexing (search for "Convert"). It might
just be easier to write a Convert plug-in that turns zip/tar into
"text/plain", so the existing indexing code can pick up the result.
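
To give a rough idea of what such a converter would have to do, here is a
minimal sketch of the extraction step only. It is written in Python purely
for illustration: a real EPrints Convert plug-in is a Perl module, and
nothing below is EPrints API. The function name zip_to_plain_text, the
TEXT_SUFFIXES list and the max_member_bytes limit are all assumptions made
up for this example.

```python
import zipfile

# Hypothetical list of member types worth indexing; adjust to your data.
TEXT_SUFFIXES = (".txt", ".csv", ".tsv", ".md", ".xml", ".json")


def zip_to_plain_text(source, max_member_bytes=1_000_000):
    """Concatenate the text-like members of a zip archive into one string.

    `source` is a path or a binary file-like object. Members that are
    directories, have an unlisted suffix, or declare an uncompressed size
    above `max_member_bytes` are skipped.
    """
    parts = []
    with zipfile.ZipFile(source) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            if not info.filename.lower().endswith(TEXT_SUFFIXES):
                continue
            if info.file_size > max_member_bytes:
                continue
            raw = archive.read(info)
            # Keep the member name so that file names are searchable too.
            parts.append(info.filename)
            # Undecodable bytes are replaced rather than raising an error.
            parts.append(raw.decode("utf-8", errors="replace"))
    return "\n".join(parts)
```

The returned string is what would be handed over as the "text/plain"
version of the archive. Rar and tar files would need their own readers.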

Note that at the moment the indexing takes no account of the "security"
settings of the documents. It might be something you need to address
when you're indexing research data. 

If you come up with something, feel free to share your work on e.g.
GitHub (github.com/eprints/eprints), thanks! 


On 17.02.2014 14:39, András Micsik wrote: 

> Hi,
> do you have any hint on how to extend the indexer to index the inside of 
> zip/rar/etc archives? Is there any ready solution for this, or do I have to 
> write an indexer plugin? The rationale behind: the large number of files 
> contain research data, so they are easiest handled as a zip, but still would 
> be nice to search inside...
> thanks,
