I would like to use the Lucene collections created by the IBM OmniFind Yahoo! Edition search engine. Using a simple program, I have successfully opened the index, and I can execute all of the basic functions and get results that match what I see when I inspect the index with Luke. The problem is that I get no results when I perform a find() on the field named _plain. (This is the primary field where all the OmniFind index terms are stored.) I suspect this is due to the character set OmniFind uses to store terms in the _plain field. Oddly enough, I do get results from all of the other fields...
** update **
After some testing I have discovered that the query does find matches, but the score() method filters out every hit with a score of 0. The scores are 0 because the norm() method returns 0. Here is what happens: the Lucene index created by OmniFind seems to have several empty elements in the field/normalization factor array.

$this->_norms
array (
    [0] => yyy|||||||
    [4] => ||||||||||
    [8] => ||||||||||
    [11] =>
    [12] =>
    [1] => |€€€€||||„
    [2] => xv|vx|xxvy
    [3] => ††††††††††
    [5] => uuuuuuuuuu
    [6] => ||||||||||
    [7] => eppqmjjeeq
    [9] =>
    [10] =>
    [13] =>
)

The segInfo->norm() method calls
PHP Code:
Since $this->_norms[$fieldNum] is empty, the returned value is 0! This bubbles all the way back up to score() and keeps the record out of the results.

** another update **
It appears that there is a bug in the _loadNorm($fieldNum) method of Zend_Search_Lucene_Index_SegmentInfo. Rather than using the value of the passed $fieldNum, the function loops over the $this->_fields array and loads all of the fields. This corrupts the $_norms array, which is why there were empty elements in it. Queries on the OmniFind index work fine if I comment out the foreach loop and use just a single line to load the norm file for the appropriate field.
PHP Code:
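
As a rough illustration of the effect described above, here is a minimal, self-contained sketch. It is not the Zend_Search_Lucene source: the functions lookupNormByte() and toyScore() are hypothetical stand-ins, and the byte/255 scaling is a simplification (Lucene decodes norm bytes differently). It only shows how an empty norm string for a field leads to a norm of 0, a score of 0, and a hit that gets filtered out.

```php
<?php
// Hypothetical stand-in for a per-field norm lookup (NOT the Zend API).
// $norms maps field number => string holding one norm byte per document.
function lookupNormByte(array $norms, $fieldNum, $docId)
{
    $bytes = isset($norms[$fieldNum]) ? $norms[$fieldNum] : '';
    // An empty (or too short) norm string has no byte for this document,
    // so the lookup falls back to 0.
    if (!isset($bytes[$docId])) {
        return 0;
    }
    return ord($bytes[$docId]);
}

// Toy scoring: scale the raw score by the norm byte.
// Assumption for illustration only; real Lucene uses a different decoding.
function toyScore($rawScore, $normByte)
{
    return $rawScore * ($normByte / 255);
}

// Two entries modelled on the dump above: field 7 has norm data, field 9 is empty.
$norms = array(
    7 => 'eppqmjjeeq',
    9 => '',
);

$hits = array();
foreach (array(7, 9) as $fieldNum) {
    $score = toyScore(1.5, lookupNormByte($norms, $fieldNum, 0));
    // Mirrors the behavior described in the post: zero scores are dropped.
    if ($score > 0) {
        $hits[$fieldNum] = $score;
    }
}

print_r(array_keys($hits));
```

In this sketch only field 7 survives the filter; the match on field 9 exists but never reaches the result list because its norm string is empty.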
Last edited by jsloan : 04-01-2008 at 08:19 PM.