Dumb question- using the cjk analyzer


(nathan.moore@gmail.com) #1

Hi everyone,

Sorry for the n00b post. I just started using ElasticSearch for a
dynamic website search. In English, everything just works great.
Love the json interface- was able to create an index easily, query it
easily, and integrate it into the website easily.

It went so well, that I've been asked to use ElasticSearch for a
Japanese text website search. Should be no problem, the documentation
says Lucene's "cjk" analyzer is supported.

So how do I do it? Everything's in utf8, I try to execute a query:
curl -XGET 'http://localhost:9200/MySite/search/_search?pretty=1&q=\uff52\uff49\uff50'
This fails to return any results. Naturally- I need the cjk analyzer
to break out the Japanese characters. Cjk will do it:
curl 'localhost:9200/MySite/_analyze?pretty=1&analyzer=cjk' -d '\uff52\uff49\uff50'
And this breaks out the tokens without a problem.
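For readers wondering what "breaks out the tokens" means here: for runs of CJK characters, the cjk analyzer emits overlapping two-character tokens (bigrams). Below is a rough Python sketch of that one idea. It is not Lucene's implementation; the function name `cjk_bigrams` is made up for illustration, and it ignores everything else the real analyzer does (width folding, lowercasing, stop words, mixed scripts).

```python
def cjk_bigrams(text):
    """Return overlapping two-character tokens for a run of CJK characters.

    Simplified illustration only: assumes `text` is a single unbroken run
    of CJK characters with no spaces, punctuation, or Latin letters.
    """
    if len(text) < 2:
        # A lone character has no neighbour to pair with, so keep it as-is.
        return [text] if text else []
    # Slide a two-character window across the run, one step at a time.
    return [text[i:i + 2] for i in range(len(text) - 1)]


print(cjk_bigrams("東京都"))  # ['東京', '京都']
```

Because both the indexed text and the query get cut into the same pairs, a query for 京都 can match a document containing 東京都, which is why the analyzer has to be applied on the index side as well as at query time.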

Doing what I thought was the obvious:
curl 'localhost:9200/MySite/search/_search?pretty=1&q=\uff52\uff49\uff50' -d {
"settings":{
"analysis":{
"analyzer":"cjk"
}
}
}'

This gives an "unable to parse" error, so obviously, I've done
something really stupid with my syntax.

Hence my n00b question: what's wrong with my syntax? What dumb thing
have I done? Do I need to set some default elsewhere to use the cjk
analyzer?

Thanks in advance,

-Nathan


(James Cook) #2

Hi Nathan,

To override the default analyzer for a particular index, you would do this:

curl -XPUT 'http://localhost:9200/myindex' -d '
{
    "settings" : {
        "analysis" : {
            "analyzer" : {
                "default" : {
                    "type" : "cjk"
                }
            }
        }
    }
}'

If you want to do the same across all indices, you can add a similar configuration to the elasticsearch.yml file under the 'index' settings.
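In case it helps anyone scripting this rather than using curl, here is a minimal Python sketch of the same two requests: the index-creation PUT carrying the analyzer settings, and a search URL carrying only the query. It only builds the requests and sends nothing. The helper names, the `BASE_URL` constant, and the `Content-Type` header are assumptions for illustration and are not from the thread; the index name `myindex` and the settings body follow the curl example above.

```python
import json
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "http://localhost:9200"  # assumed local node, as in the thread


def build_create_index_request(index_name, analyzer_type="cjk"):
    """Build (but do not send) the PUT that sets the index's default analyzer."""
    body = {
        "settings": {
            "analysis": {
                "analyzer": {
                    "default": {"type": analyzer_type}
                }
            }
        }
    }
    return Request(
        f"{BASE_URL}/{index_name}",
        data=json.dumps(body).encode("utf-8"),
        # Assumption: newer Elasticsearch versions expect an explicit content type.
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def build_search_url(index_name, query):
    """Build a URI search URL with the query percent-encoded as UTF-8."""
    return f"{BASE_URL}/{index_name}/_search?pretty=1&q={quote(query)}"


req = build_create_index_request("myindex")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
print(build_search_url("myindex", "\uff52\uff49\uff50"))
```

To actually create the index, pass `req` to `urllib.request.urlopen` against a running node. The point of the split is that the analyzer is part of the index settings, so it goes in the creation request; the search request then needs no "settings" body at all.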


(nathan.moore@gmail.com) #3

Thank you so much! That was fast. Worked too :)

-Nathan

On Sep 22, 12:26 pm, James Cook jc...@tracermedia.com wrote:


