Trouble searching for non-English Unicode terms


(Alexander Karelas) #1

I have a bunch of texts in Unicode Greek (with a few English words in
them).

I've indexed them plainly, with the following mapping:

 properties => {
     subject     => { type   => 'string', store => 'yes' },
     rss_body    => { type   => 'string' },
     date        => { type   => 'date' },
     url         => { type   => 'string' },
     author      => { type   => 'string', index => 'not_analyzed' }
 }

...but when I try to search for a Greek word in the subject as follows:

     query   => { term => { subject => $terms } }

...I get 0 results. Searching for an English word that exists returns
results normally.

I tried adding: analyzer => "icu_normalizer" to the mapping of the
"subject" field, but nothing different happened.

My question: Is this normal behaviour for an unconfigured installation
of elasticsearch? What can I do to solve this problem?

Thank you,

A.


(Clinton Gormley) #2

Hiya

Unicode Greek does work - see this URL for an example:

http://dev.iannounce.net/search?keywords=ελληνική+γλώσσα&date_limit=&date=&type=all_notices&_fstatus=search

> ...but when I try to search for a Greek word in the subject as follows:
>
>      query   => { term => { subject => $terms } }

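You don't want a term query here. A term query is not analyzed, so the
text you pass has to match an indexed token exactly. You want a
query_string or field query, both of which analyze the text first: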

Try: query => { field => { subject => $terms }}
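
To illustrate the difference, here is a rough sketch in plain Perl. The
toy_analyze, term_match and field_match helpers are made up for this
example and only approximate what an analyzed 'string' field does
(lowercasing and splitting on whitespace); they are not part of
elasticsearch or its Perl client. Case is just one way an unanalyzed
term can miss, and I am assuming that is what happens with your input.
The last two hashes are the query bodies you would pass as "query".

```perl
use strict;
use warnings;
use utf8;    # this source file contains UTF-8 string literals

# Toy stand-in for an analyzer (an assumption, not the real one):
# lowercase the text and split it on whitespace.
sub toy_analyze {
    my ($text) = @_;
    return grep { length } split /\s+/, lc $text;
}

# "Index" one subject line: keep the analyzed tokens in a hash.
my $subject = 'Ελληνική γλώσσα';
my %index   = map { $_ => 1 } toy_analyze($subject);

# A term query looks the input up verbatim, with no analysis.
sub term_match {
    my ($input) = @_;
    return exists $index{$input} ? 1 : 0;
}

# An analyzed query runs the input through the same analysis first.
sub field_match {
    my ($input) = @_;
    return ( grep { $index{$_} } toy_analyze($input) ) ? 1 : 0;
}

printf "term,  as typed:    %d\n", term_match('Ελληνική');
printf "term,  lowercased:  %d\n", term_match('ελληνική');
printf "field, as typed:    %d\n", field_match('Ελληνική');

# Query bodies that analyze the search text before matching.
my $terms        = 'Ελληνική';
my $field_query  = { field => { subject => $terms } };
my $string_query = {
    query_string => { default_field => 'subject', query => $terms },
};
```

In this sketch only the first lookup misses, because the capitalised
word was lowercased when it was "indexed" but not when it was searched.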

clint


