I have this problem.
I add a document to the index with a headline, URL, contents and an MD5-hashed unique string that identifies the record in the index.
When I need to update the record, I first search for the MD5 string, then remove the document from the index, optimize the index, and add the updated document back.
And now the fun part starts.
If I now try to search for this MD5 string, it's no longer searchable. And if I do an update, I get a new document instead of an updated one.
The update / remove code is here:
PHP Code:
public function updateIndex( Cms_SearchEngine_Value $obj )
{
    $doc = new Zend_Search_Lucene_Document();
    if ( $this->lucene == null )
    {
        $this->lucene = Zend_Search_Lucene::open( $this->path );
    }
    // Remove any existing copy of this record before re-adding it.
    $this->removeIndex( $obj );
    $doc->addField( Zend_Search_Lucene_Field::Keyword('link', $obj->getPath(), 'utf-8' ));
    $doc->addField( Zend_Search_Lucene_Field::Keyword('hashid', $obj->getID(), 'utf-8' ));
    $doc->addField( Zend_Search_Lucene_Field::Text('title', $obj->getTitle(), 'utf-8' ));
    $doc->addField( Zend_Search_Lucene_Field::UnStored('contents', $obj->getContent(), 'utf-8' ));
    Cms_Logger::getInstance()->info( 'Adding '.$obj->getTitle().' to the index', get_class($this), Cms_Logger::SEARCH_ENGINE );
    $this->lucene->addDocument( $doc );
    $this->lucene->optimize();
}

public function removeIndex( Cms_SearchEngine_Value $obj )
{
    if ( $this->lucene == null )
    {
        $this->lucene = Zend_Search_Lucene::open( $this->path );
    }
    Cms_Logger::getInstance()->info('Looking up ID: ' . $obj->getID(), get_class( $this), Cms_Logger::SEARCH_ENGINE );
    // Look up the record by its hash ID and delete every hit.
    $hits = $this->lucene->find( 'hashid' . $obj->getID() );
    $counter = 0;
    foreach ( $hits as $hit )
    {
        Cms_Logger::getInstance()->info('Working with Lucene ID: '.$hit->id.' and HashID: '.$hit->hashid, get_class($this), Cms_Logger::SEARCH_ENGINE );
        $this->lucene->delete( $hit->id );
        $counter++;
    }
    Cms_Logger::getInstance()->info('Found ' . $counter . ' instances of the ID ' . $obj->getID(), get_class($this), Cms_Logger::SEARCH_ENGINE );
    Cms_Logger::getInstance()->info( 'Removed '.$obj->getTitle().' from the index', get_class($this), Cms_Logger::SEARCH_ENGINE );
    $this->lucene->optimize();
}
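
For comparison, the lookup-and-delete step could also be written with an explicit term query instead of a query string, so the hash value is matched exactly against the `hashid` field without going through the query parser. This is only a sketch under assumptions: `deleteByHashId()` is a hypothetical helper name, it takes an already opened index, and it is not a confirmed fix for the behaviour described above.

```php
<?php
/**
 * Sketch only: delete every document whose 'hashid' field equals $hashId.
 * deleteByHashId() is a hypothetical helper, not part of Zend Framework.
 *
 * @param object $index  an opened index, e.g. from Zend_Search_Lucene::open()
 * @param string $hashId the MD5 string stored in the 'hashid' Keyword field
 * @return int number of documents marked as deleted
 */
function deleteByHashId($index, $hashId)
{
    // Build the query from an exact term so the value is not passed
    // through the query parser or the analyzer.
    $term  = new Zend_Search_Lucene_Index_Term($hashId, 'hashid');
    $query = new Zend_Search_Lucene_Search_Query_Term($term);

    $deleted = 0;
    foreach ($index->find($query) as $hit) {
        // $hit->id is the internal Lucene document number.
        $index->delete($hit->id);
        $deleted++;
    }

    // Write the pending deletions to the index.
    $index->commit();

    return $deleted;
}
```

Inside `removeIndex()` this would be called as `deleteByHashId( $this->lucene, $obj->getID() )`, and the returned count could be logged in place of `$counter`.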
Anyone? (And yes, I need UTF-8, since I live in Iceland and build multilingual websites.)