  #1 (permalink)  
Old 09-21-2007, 01:49 PM
Junior Member
 
Join Date: Sep 2007
Posts: 3
Unable to find Documents by numeric values

Hello there,

I just set up the Zend Lucene Search Engine (ZF 1.0.1) and want to use it for indexing a large amount of user data. I already have a unique numeric ID for each user, so I am trying to reuse it to retrieve the document for a given user (in order to delete and re-add it).

The problem is that Lucene does not let me search for numeric values.
I tried all of this:

1. Using Zend_Search_Lucene_Field::Text instead of Zend_Search_Lucene_Field::Keyword when adding fields.
2. Using a custom field name instead of adding ...Keyword('id', $userId).

But in all cases, whatever I do, I run into these issues:
1. Lucene overwrites the id with its own if I write ...::Keyword('id', $userId).
2. Regardless of whether I search in the predefined id column or in my custom 'userid' column, Lucene does not find any hits when I use numeric values for the ID.

So it seems that this version of Lucene ALWAYS overwrites the document id with its own id, and that searching for numeric values is simply not possible.

My workaround for now is to convert the numeric ID into a string (e.g. a hash). Then it works perfectly, but it is not a very clean solution in my opinion.

So, who can tell me what I'm doing wrong?

Here's the code snippet how I'm indexing:
Code:
$index = new Zend_Search_Lucene($location);
$doc = new Zend_Search_Lucene_Document();
// Tokenized field named 'id'
$doc->addField(Zend_Search_Lucene_Field::Text('id', $userId, 'utf-8'));
// Untokenized field holding the same numeric user ID
$doc->addField(Zend_Search_Lucene_Field::Keyword('userid', $userId, 'utf-8'));
$index->addDocument($doc);
$index->commit();
And this is how I try to search:

Code:
// Search for lucene internal ID which has been overwritten but does not work anyway
$hits = $index->find('id:113');

// Search for my Custom user id which also does not deliver any hit
$hits = $index->find('userid:113');
Btw: I already verified that all the data is stored correctly in the index. At least, I can see all of it when I display the content of all documents in the index.

Thank you in advance!
  #2 (permalink)  
Old 09-26-2007, 01:29 PM
Junior Member
 
Join Date: Sep 2007
Posts: 3

Okay, I got the answer myself: Just do not use the default analyzer.
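
A minimal sketch of why the analyzer matters, assuming (as I understand the Zend_Search_Lucene documentation) that the default analyzer only treats runs of ASCII letters as tokens, while the "TextNum" analyzers also keep digits. The two functions below are hypothetical stand-ins written in plain PHP, not the library's own code; they only imitate that tokenizing behaviour so the effect on a query value like 113 is visible.

```php
<?php
// Hypothetical stand-in for a letters-only analyzer:
// every run of ASCII letters becomes one lowercased token.
function lettersOnlyTokens($text)
{
    preg_match_all('/[a-zA-Z]+/', $text, $matches);
    return array_map('strtolower', $matches[0]);
}

// Hypothetical stand-in for a letters-and-digits analyzer:
// every run of ASCII letters or digits becomes one lowercased token.
function lettersAndDigitsTokens($text)
{
    preg_match_all('/[a-zA-Z0-9]+/', $text, $matches);
    return array_map('strtolower', $matches[0]);
}

// The value part of the query 'userid:113'
$value = '113';

$lettersOnly = lettersOnlyTokens($value);       // => []       (nothing left to search for)
$withDigits  = lettersAndDigitsTokens($value);  // => ['113']
```

If the query string is analyzed this way, a purely numeric value leaves no term to look up, which would explain why neither 'id:113' nor 'userid:113' returns a hit.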
  #3 (permalink)  
Old 12-20-2007, 09:04 PM
Junior Member
 
Join Date: Dec 2007
Posts: 1
What analyser works?

Could you post what analyzer you found that works for numeric values?
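
For reference, a hedged sketch of switching the default analyzer to one of the "TextNum" analyzers mentioned in this thread. It assumes Zend Framework is already loaded (e.g. via require_once 'Zend/Search/Lucene.php'); the wrapper function name useTextNumAnalyzer is hypothetical. The analyzer would presumably have to be set before both indexing and searching, and an index built with a different analyzer may need to be rebuilt.

```php
<?php
// Assumes the Zend_Search_Lucene classes are already loaded.
// useTextNumAnalyzer() is a hypothetical helper name for illustration.
function useTextNumAnalyzer()
{
    // Replace the default analyzer with the case-insensitive
    // letters-and-digits analyzer shipped with Zend_Search_Lucene.
    Zend_Search_Lucene_Analysis_Analyzer::setDefault(
        new Zend_Search_Lucene_Analysis_Analyzer_Common_TextNum_CaseInsensitive()
    );
}
```

Call useTextNumAnalyzer() once at startup, before opening the index for adding documents or running find().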
  #4 (permalink)  
Old 01-04-2008, 12:28 AM
Junior Member
 
Join Date: Dec 2007
Posts: 3

I am having problems with this myself. I have tried a few different analyzers, including the UTF8 one and the "TextNum" ones, with no luck. I have also tried adding the numeric field as Text, Keyword and UnStored.

Any help is appreciated...

- aba
  #5 (permalink)  
Old 01-18-2008, 02:25 AM
Junior Member
 
Join Date: Jan 2008
Posts: 4

I would also be interested to see what analyzer allows integer searches.

FYI, I used md5 to convert my database key to a unique searchable entry for each index record:

PHP Code:
$index = Zend_Search_Lucene::open($index_path);
$uid = md5($database_id);
$term = new Zend_Search_Lucene_Index_Term($uid, 'uid');
$uid = new Zend_Search_Lucene_Search_Query_Term($term);
$hits = $index->find($uid);
... 
  #6 (permalink)  
Old 08-15-2008, 03:49 AM
Junior Member
 
Join Date: Aug 2008
Posts: 1

Ok, after messing with this for a long time I found out that other people can't search for numeric data either. Here is a solution:
PHP Code:
$index = Zend_Search_Lucene::open();
// Build the term directly; cast the number to a string (single quotes would not interpolate the variable)
$pathTerm  = new Zend_Search_Lucene_Index_Term((string) $numericData, 'fieldname');
$pathQuery = new Zend_Search_Lucene_Search_Query_Term($pathTerm);
$query = new Zend_Search_Lucene_Search_Query_Boolean();
$query->addSubquery($pathQuery, true /* required */);
$hits = $index->find($query);
Note: fieldname can be omitted but obviously if you are looking to delete a Lucene Doc to update it you want to search for the exact field.
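
A rough sketch of the delete-and-re-add step the first post was after, built on the term query above. The function name replaceUserDocument and its parameters are hypothetical; 'userid' is the custom field from the first post. It assumes $index is an opened Zend_Search_Lucene index, $doc is the new Zend_Search_Lucene_Document, and that $hit->id is Lucene's internal document number, which is what delete() expects.

```php
<?php
// Hypothetical helper: remove any document stored under this user ID,
// then add the fresh one. Assumes Zend Framework is already loaded.
function replaceUserDocument($index, $userId, $doc)
{
    // Build the term directly so the query string parser is not involved.
    $term  = new Zend_Search_Lucene_Index_Term((string) $userId, 'userid');
    $query = new Zend_Search_Lucene_Search_Query_Term($term);

    // $hit->id is the internal document number, not the stored user ID.
    foreach ($index->find($query) as $hit) {
        $index->delete($hit->id);
    }

    $index->addDocument($doc);
    $index->commit();
}
```

Usage would look like replaceUserDocument($index, 113, $doc), with $doc containing the same 'userid' Keyword field so the next update can find it again.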
All times are GMT.