It appears to me they do.
(Apologies if this is a repost. Posted over an hour ago and it hasn't shown
up here.)
#!/bin/sh
# Recreate a one-shard test index whose "syn" analyzer combines the
# keyword tokenizer with a synonym filter, then analyze two inputs.
echo "\nattempt to delete the index"
curl -XDELETE "http://localhost:9200/syndex/?pretty=false"
echo "\ncreate the index"
curl -XPUT "http://localhost:9200/syndex/?pretty=true" -d '{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0,
    "analysis": {
      "analyzer": {
        "syn": {
          "tokenizer": "keyword",
          "filter": ["color_synonym"]
        }
      },
      "filter": {
        "color_synonym": {
          "type": "synonym",
          "synonyms": ["red, another shade"]
        }
      }
    }
  }
}'
echo "\n analyze red: I want a single token another shade"
echo "\n instead I get two tokens another & shade"
curl -XGET "localhost:9200/syndex/_analyze?analyzer=syn&pretty=true" -d "red"
echo "\n sanity check how keyword tokenizer handles another shade"
curl -XGET "localhost:9200/syndex/_analyze?analyzer=syn&pretty=true" -d "another shade"
echo "\n and of course it does not split them"
Running the script generates the following output:
attempt to delete the index
{"ok":true,"acknowledged":true}
create the index
{
"ok" : true,
"acknowledged" : true
}
analyze red: I want a single token another shade
instead I get two tokens another & shade
{
"tokens" : [ {
"token" : "red",
"start_offset" : 0,
"end_offset" : 3,
"type" : "SYNONYM",
"position" : 1
}, {
"token" : "another",
"start_offset" : 0,
"end_offset" : 3,
"type" : "SYNONYM",
"position" : 1
}, {
"token" : "shade",
"start_offset" : 0,
"end_offset" : 3,
"type" : "SYNONYM",
"position" : 2
} ]
}
sanity check how keyword tokenizer handles another shade
{
"tokens" : [ {
"token" : "another shade",
"start_offset" : 0,
"end_offset" : 13,
"type" : "word",
"position" : 1
} ]
}
and of course it does not split them
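One variation that may be worth trying is sketched below. It assumes the synonym filter in this Elasticsearch version accepts its own "tokenizer" setting for parsing the synonym rules (as far as I can tell it defaults to whitespace, which would explain the split). I have not verified the result against a running node. ES_URL is a hypothetical variable introduced only for this sketch; when it is empty the script just prints the request body.

```shell
#!/bin/sh
# Sketch only: same settings as the script above, plus an assumed
# "tokenizer": "keyword" on the synonym filter so that the rule
# "another shade" is not split on whitespace when it is parsed.

# Hypothetical variable: set to e.g. http://localhost:9200 to send requests.
ES_URL="${ES_URL:-}"

SETTINGS='{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0,
    "analysis": {
      "analyzer": {
        "syn": {
          "tokenizer": "keyword",
          "filter": ["color_synonym"]
        }
      },
      "filter": {
        "color_synonym": {
          "type": "synonym",
          "tokenizer": "keyword",
          "synonyms": ["red, another shade"]
        }
      }
    }
  }
}'

# Always show the body that would be sent.
printf '%s\n' "$SETTINGS"

# Only talk to a server when ES_URL has been set explicitly.
if [ -n "$ES_URL" ]; then
  curl -XDELETE "$ES_URL/syndex/?pretty=false"
  curl -XPUT "$ES_URL/syndex/?pretty=true" -d "$SETTINGS"
  curl -XGET "$ES_URL/syndex/_analyze?analyzer=syn&pretty=true" -d "red"
fi
```

If the setting behaves the way I am assuming, the last request would be the one to compare against the three-token output above.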