Analyzing with smartcn returns garbled text


(Roro Codeath) #1

Here are the full steps to reproduce. How can I make the response tokens come back as Chinese instead of garbled text?

curl -XDELETE localhost:9200/test_chinese
{"acknowledged":true}

curl -XPUT localhost:9200/test_chinese -d '{
>   "settings": {
>     "index": {
>       "analysis": {
>         "analyzer": {
>           "default": {
>             "type": "smartcn"
>           }
>         }
>       }
>     }
>   }
> }'
{"acknowledged":true,"shards_acknowledged":true}

roroco@roroco ~/Dropbox/rbs/ro_elasticsearch $ curl localhost:9200/test_chinese/_analyze?text='这个不错'
{"tokens":[{"token":"│","start_offset":0,"end_offset":1,"type":"word","position":0},{"token":"﾿","start_offset":1,"end_offset":2,"type":"word","position":1},{"token":"ル","start_offset":2,"end_offset":3,"type":"word","position":2},{"token":"¦","start_offset":3,"end_offset":4,"type":"word","position":3},{"token":"ᄌ","start_offset":4,"end_offset":5,"type":"word","position":4},{"token":"ᆰ","start_offset":5,"end_offset":6,"type":"word","position":5},{"token":"¦","start_offset":6,"end_offset":7,"type":"word","position":6},{"token":"ᄌ","start_offset":7,"end_offset":8,"type":"word","position":7},{"token":"ヘ","start_offset":8,"end_offset":9,"type":"word","position":8},{"token":"←","start_offset":9,"end_offset":10,"type":"word","position":9},{"token":"ヤ","start_offset":10,"end_offset":11,"type":"word","position":10},{"token":"ル","start_offset":11,"end_offset":12,"type":"word","position":11}]}

(Roro Codeath) #2

I found the solution. Passing the text as a query parameter doesn't work (even &text=很不错 comes back garbled), but sending it in a text field in the request body does:

curl localhost:9200/test_chinese/_analyze?analyzer=smartcn -d '{"text": "很不错"}'

roroco@roroco ~/Dropbox/rbs/ro_elasticsearch $ curl localhost:9200/test_chinese/_analyze?analyzer=smartcn -d '{"text": "很不错"}'
{"tokens":[{"token":"很","start_offset":0,"end_offset":1,"type":"word","position":0},{"token":"不错","start_offset":1,"end_offset":3,"type":"word","position":1}]}
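
A likely explanation, offered as a sketch rather than a confirmed diagnosis: the garbled response above has 12 one-character tokens, and the 4 characters of 这个不错 take exactly 12 bytes in UTF-8, so it looks as if each raw byte of the query string was decoded as a separate character. The snippet below checks the byte count and builds a percent-encoded form of the text. The variable names are only illustrative, and the commented curl call at the end is an untested assumption (it relies on the _analyze endpoint still accepting a text query parameter, as it does in the first post).

```shell
# Sample text from the first post.
text='这个不错'

# Count the UTF-8 bytes: 4 Chinese characters at 3 bytes each.
byte_count=$(printf '%s' "$text" | LC_ALL=C wc -c | tr -d ' ')
echo "bytes: $byte_count"

# Percent-encode every byte (hex dump, then prefix each byte with %).
encoded=$(printf '%s' "$text" | od -An -tx1 | tr -d ' \n' | sed 's/../%&/g' | tr 'a-f' 'A-F')
echo "encoded: $encoded"

# Untested assumption: letting curl percent-encode the query parameter
# may avoid the garbling without moving the text into the body.
# curl -G localhost:9200/test_chinese/_analyze --data-urlencode "text=$text"
```

Sending the text in a JSON body, as in the command above, sidesteps the question entirely, since the body is transmitted as UTF-8 and needs no URL encoding.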
