
[EP-tech] Re: Question about full text search (Documents in Advanced Search page)



Just for those who may have a similar problem in the future: it turned
out that I needed to force the indexer to do a full reindex.  It's
unfortunate that I had to do this, since the indexer is running all the
time, but that's what fixed it for me.
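
For readers landing here later: a full reindex is usually queued with
EPrints' `epadmin reindex` command. The sketch below only builds that
command line. The install path `/opt/eprints3` and the repository id
`myrepo` are assumptions, not details from this thread, and
`build_reindex_command` is a hypothetical helper name. Whether
reindexing the `eprint` dataset also refreshes the full-text index may
depend on the EPrints version, so check the documentation for yours.

```python
from pathlib import PurePosixPath
import shlex


def build_reindex_command(repo_id, dataset="eprint", eprints_root="/opt/eprints3"):
    """Return the argv list that queues a reindex of one dataset.

    eprints_root and the default dataset are assumptions for illustration;
    adjust them to match the local install.
    """
    if not repo_id:
        raise ValueError("repo_id must be a non-empty repository id")
    # epadmin lives in the bin/ directory of the EPrints install.
    epadmin = PurePosixPath(eprints_root) / "bin" / "epadmin"
    return [str(epadmin), "reindex", repo_id, dataset]


if __name__ == "__main__":
    # "myrepo" is a placeholder repository id.
    print(shlex.join(build_reindex_command("myrepo")))
```

The printed command would normally be run as the eprints user. It only
schedules the work, so the indexer still has to be running to process
the queue.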

Thanks for the help,
Mike.


On 1/27/2016 1:23 PM, Michael Street wrote:
> Hi Lizz,
>
> Thanks, yes, I have found the tables on my own and can manually insert
> terms and it works that way.  I just can't figure out where the
> disconnect is between what the Indexer is seeing and what it is, or
> isn't, inserting into the db.
>
> I will check the video for more hints though, thanks.
>
> --Mike
>
>
> On 1/27/2016 11:24 AM, Lizz Jennings wrote:
>> Hi Michael,
>>
>> Have you looked at the database entries for the indexes?  Adam showed which tables to look at (at about 6 minutes in) in the troubleshooting search video:
>>
>> http://wiki.eprints.org/w/Training_Video:Search_Troubleshooting
>>
>> That might offer a hint?
>>
>> Lizz
>>
>> --
>>
>> Lizz Jennings BA MSc ACLIP MCLIP (Revalidated 2015)
>>
>> Technical Data Officer
>>
>> The Library 4.10, University of Bath, Bath, BA2 7AY UK
>>
>> Ext. 3570 (External 01225 383570)
>>
>> E.Jennings at bath.ac.uk
>>
>> ________________________________________
>> From: eprints-tech-bounces at ecs.soton.ac.uk <eprints-tech-bounces at ecs.soton.ac.uk> on behalf of Michael Street <mstreet at yorku.ca>
>> Sent: 27 January 2016 15:46
>> To: eprints-tech at ecs.soton.ac.uk
>> Subject: [EP-tech] Re: Question about full text search (Documents in Advanced Search page)
>>
>> Hi folks,
>>
>> Is there any other way to actually find out more details about what the
>> Indexer is doing?  I've turned on verbose logging and set loglevel 6, but
>> I'd like to really know exactly what terms it's found, and what it's
>> inserting, if anything, into the database.
>>
>> Thanks,
>> Mike.
>>
>> On 1/27/2016 9:46 AM, Michael Street wrote:
>>> Hi Alan,
>>>
>>> Thanks but I have tried that.  I've increased the logging verbosity and
>>> tried reindexing one of the offending deposits.  Nothing in the logs.
>>> To be honest, I see nothing in the logs except that there are no tasks.
>>>
>>> Occasionally I see something about documents being locked but the
>>> numbers don't match up.  I'm not sure how the numbering system works
>>> (for ex. 'document.5917 is locked').  I would assume though, that I
>>> would see an error message when reindexing one of the offending
>>> deposits.  I don't see anything when reindexing those, so I assume the
>>> 'locked' message has nothing to do with it.
>>>
>>> I will try the Xapian plugin later and see if that makes any difference.
>>>
>>> --Mike
>>>
>>>
>>> On 1/25/2016 4:20 AM, Alan.Stiles wrote:
>>>> Have you tried to reindex one of the missing items to see if it made a difference?  Check the error_log whilst it reindexes in case eprints is having some other issue with opening the pdf (we sometimes have issues with e.g. apostrophes in the filenames).
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: eprints-tech-bounces at ecs.soton.ac.uk [mailto:eprints-tech-bounces at ecs.soton.ac.uk] On Behalf Of Michael Street
>>>> Sent: 22 January 2016 21:01
>>>> To: eprints-tech at ecs.soton.ac.uk
>>>> Subject: [EP-tech] Re: Question about full text search (Documents in Advanced Search page)
>>>>
>>>> Hi again,
>>>>
>>>> Does anyone have any idea why these documents are not showing up in the search results?
>>>>
>>>> Any suggestions would really be appreciated.  I'm at a loss as to why it's not returning results that clearly have the search term in the pdf (and the converted text document).
>>>>
>>>> --Mike Street
>>>>
>>>> On 1/15/2016 11:05 AM, Michael Street wrote:
>>>>> Hi John,
>>>>>
>>>>> Thanks very much for your response.  Please find my answers below:
>>>>>
>>>>> 1)  Indexer is running and confirmed to be working.  The documents
>>>>> that don't show up are some of the oldest and are available through
>>>>> other links.  Newly deposited items also show up in the Views.
>>>>>
>>>>> 2)  I have tried pdftotext on the system and had no issues with
>>>>> converting it.  I also was able to find the search term within the
>>>>> document easily.
>>>>>
>>>>> 3)  I run a cronjob that updates the DB and switches everything to be
>>>>> visible, every 15 minutes.  My client does not want anything to be
>>>>> hidden, especially previous versions of eprints, so this was the
>>>>> easiest way to achieve that, for me.  Also, the eprints in question do
>>>>> show up in the Views, which shows they're set to visible.
>>>>>
>>>>> So if you have any other ideas, I'd really appreciate it.  I'm at a
>>>>> loss here.
>>>>>
>>>>> Thanks,
>>>>> Mike.
>>>>>
>>>>>
>>>>> On 1/14/2016 4:35 PM, John Salter wrote:
>>>>>> Hi,
>>>>>> I'd check that your indexer is running, and that the task queue is processed.
>>>>>>
>>>>>> I'd also check that the PDFs aren't restricted in some way (maybe see what something like pdftotext returns when run against one of the not-returned PDFs).
>>>>>>
>>>>>> Also, as was mentioned in a different thread recently, check what the 'metadata visibility' flag for the EPrint is.
>>>>>>
>>>>>> If none of that gets you anywhere, let us know and we'll put our collective thinking caps on!
>>>>>>
>>>>>> Cheers,
>>>>>> John
>>>>>>
>>>>>> ________________________________________
>>>>>> From: eprints-tech-bounces at ecs.soton.ac.uk
>>>>>> <eprints-tech-bounces at ecs.soton.ac.uk> on behalf of Michael Street
>>>>>> <mstreet at yorku.ca>
>>>>>> Sent: 14 January 2016 16:04
>>>>>> To: eprints-tech at ecs.soton.ac.uk
>>>>>> Subject: [EP-tech] Question about full text search (Documents in Advanced Search page)
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I've got some pdfs in the repository that include the phrase 'bohm'
>>>>>> many times but the Advanced Search page is only returning 4 out of
>>>>>> probably 25+ eprints as hits on the phrase.  I'm using the Documents
>>>>>> search box, which I believe is the full-text search box.  Is there
>>>>>> something I'm missing?
>>>>>>
>>>>>> Any help would be appreciated thanks, Mike.
>>>>>>
>>>>>> *** Options:
>>>>>> http://mailman.ecs.soton.ac.uk/mailman/listinfo/eprints-tech
>>>>>> *** Archive: http://www.eprints.org/tech.php/
>>>>>> *** EPrints community wiki: http://wiki.eprints.org/
>>>>>> *** EPrints developers Forum: http://forum.eprints.org/
>>>>>>
>>>> -- The Open University is incorporated by Royal Charter (RC 000391), an exempt charity in England & Wales and a charity registered in Scotland (SC 038302). The Open University is authorised and regulated by the Financial Conduct Authority.