Here is a program in which I create an index with some example data:
PHP Code:
$itemId = 3245;
$title = 'деякий екземпловий текст'; // some sample Cyrillic text
setlocale(LC_CTYPE, 'uk_UA.UTF-8');
Zend_Search_Lucene_Analysis_Analyzer::setDefault(
new Zend_Search_Lucene_Analysis_Analyzer_Common_Utf8());
Zend_Loader::loadClass('Zend_Search_Lucene');
$index = Zend_Search_Lucene::create('tmp/index');
$doc = new Zend_Search_Lucene_Document();
$doc->addField(Zend_Search_Lucene_Field::UnIndexed('itemId', $itemId));
$doc->addField(Zend_Search_Lucene_Field::Text('url', $url));
$doc->addField(Zend_Search_Lucene_Field::Keyword('title', $title));
$doc->addField(Zend_Search_Lucene_Field::UnStored('contents', $contentText)); // $contentText holds Cyrillic text
$index->addDocument($doc);
After running this code, I try to search for the indexed text:
PHP Code:
$index = Zend_Search_Lucene::open('tmp/index');
$hits = $index->find($this->getRequest()->query); // the query contains a Cyrillic word that also occurs in the indexed text
foreach ($hits as $item) // iterate over the hits returned by find()
{
    echo $item->title;
    echo '<br />';
    echo $item->url;
}
But $index->find() returns an empty result! Please help me understand what is wrong. The program works fine when the text contains only Latin characters. Why does it matter whether I use Latin or Cyrillic characters?
Thank you in advance!