|
I tried it, but nothing happened.
You can test with some of these characters: á ạ ầ ą. ZF can't index the UTF-8 characters correctly, and when I search for them I can't read these characters in the results (browser set to View / Character Encoding / UTF-8).
Last edited by rassen : 04-23-2008 at 11:45 AM. |
|
|||
|
You could write a text analyzer that replaces non-standard characters with their equivalents. For instance, you can replace 'ą' with 'a', or, in a more complex scheme, 'ą' with 'xxxaxxx', and vice versa during search. Tomorrow I will send you sample code. It works perfectly. |
|
|||
|
Quote:
Code:
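class Lucene_Helper {
    // Polish letters and their ASCII stand-ins. 'ź' maps to 'x' (not 'z')
    // so that it stays distinct from 'ż' and the mapping can be reversed.
    protected $_find = array('ą','ż','ś','ź','ę','ć','ń','ó','ł','Ą','Ż','Ś','Ź','Ę','Ć','Ń','Ó','Ł');
    protected $_replace = array('a','z','s','x','e','c','n','o','l','A','Z','S','X','E','C','N','O','L');

    /**
     * Replace each Polish letter with an 'xxx<letter>xxx' token, then
     * transliterate any remaining non-ASCII characters.
     *
     * @param string $string UTF-8 text
     * @return string ASCII text
     */
    public function simplify($string) {
        foreach ($this->_find as $key => $value) {
            $string = str_replace($value, 'xxx' . $this->_replace[$key] . 'xxx', $string);
        }
        $string = iconv('UTF-8', 'ASCII//TRANSLIT', $string);
        return $string;
    }

    /**
     * Turn the 'xxx<letter>xxx' tokens back into Polish letters.
     *
     * @param string $string ASCII text produced by simplify()
     * @return string UTF-8 text
     */
    public function unsimplify($string) {
        // ASCII is already valid UTF-8, so no iconv() call is needed here.
        $map = array();
        foreach ($this->_replace as $key => $value) {
            $map['xxx' . $value . 'xxx'] = $this->_find[$key];
        }
        // strtr() works left to right in a single pass, so adjacent tokens
        // such as 'xxxexxx' . 'xxxxxxx' (ęź) are decoded correctly.
        return strtr($string, $map);
    }
}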
Code:
$luceneHelper = new Lucene_Helper();
$doc->addField(Zend_Search_Lucene_Field::UnStored('subject', $luceneHelper->simplify($this->subject)));
$doc->addField(Zend_Search_Lucene_Field::UnStored('body', $luceneHelper->simplify($this->body)));
Code:
$queryStr = $luceneHelper->simplify('our query with zażółć gęsią jaźń ;)');
$query = Zend_Search_Lucene_Search_QueryParser::parse($queryStr);
Code:
// Lucene_Helper is the class shown above.
$luceneHelper = new Lucene_Helper();
$post = $postDAO->find($post_id)->current();
$highlightedBody = $this->query->highlightMatches($luceneHelper->simplify($post->body));
$highlightedSubject = $this->query->highlightMatches($luceneHelper->simplify($post->subject)); |
|
|||
|
Thank you. I'll check your example code.
I have one more question. With UTF-8 data, can I find a record using plain ASCII keywords? For example, my data string is "abc ćńó xyz", and when I search with the query "cno", the result should return that record. Is that possible? |
|
|||
|
Quote:
The search query can be processed, but the data returned is not UTF-8 data. I thought about this solution: to keep my data intact, perhaps I need to store each field in two versions, a pure non-UTF-8 version and a UTF-8 version, one for searching and one for display in the search results. That costs too much. (Sorry for my English.) |
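One way to read the two-version idea is that each record carries an ASCII-folded copy that is only searched and the original UTF-8 copy that is only displayed. The sketch below is a minimal, framework-free illustration of that, not code from this thread: `fold_diacritics()` and `make_search_record()` are hypothetical helpers, the character map is an assumption that covers only the Polish letters used earlier, and a substring check stands in for the real index lookup. In Zend_Search_Lucene the folded copy could probably go into an `UnStored` field and the original into an `UnIndexed` field, so the original is stored once and never tokenized.

```php
<?php
// Hypothetical helper: fold Polish diacritics to plain ASCII letters.
// The map is an assumption for illustration; extend it for other alphabets.
function fold_diacritics($text)
{
    static $map = array(
        'ą' => 'a', 'ć' => 'c', 'ę' => 'e', 'ł' => 'l', 'ń' => 'n',
        'ó' => 'o', 'ś' => 's', 'ź' => 'z', 'ż' => 'z',
        'Ą' => 'A', 'Ć' => 'C', 'Ę' => 'E', 'Ł' => 'L', 'Ń' => 'N',
        'Ó' => 'O', 'Ś' => 'S', 'Ź' => 'Z', 'Ż' => 'Z',
    );
    // strtr() with an array replaces whole UTF-8 byte sequences in one pass.
    return strtr($text, $map);
}

// Hypothetical record layout: 'search' is matched against, 'display' is shown.
function make_search_record($original)
{
    return array(
        'search'  => fold_diacritics($original),
        'display' => $original,
    );
}

$record = make_search_record('abc ćńó xyz');
$query  = fold_diacritics('cno');   // fold the query the same way as the data

// Naive substring match stands in for the real index lookup.
if (strpos($record['search'], $query) !== false) {
    echo $record['display'], "\n";  // shows the untouched UTF-8 original
}
```

Because the original text is kept separately, the folding here can be one-way (both 'ź' and 'ż' become 'z') and no reversible 'xxx' tokens are needed. The cost is one extra stored copy of each displayed field.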
|
|