Hi everyone,
I'm trying to get the Edge nGram token filter working based on the following documentation: https://www.elastic.co/guide/en/elasticsearch/guide/current/_index_time_search_as_you_type.html
I'm using version 1.7 and I can't get the same result as described in the docs. Here are my index settings:
$ curl -XPUT --data-binary @index_settings.json http://localhost:9200/test_ngram
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0,
    "mappings": {
      "my_type": {
        "properties": {
          "name": {
            "type": "string",
            "analyzer": "autocomplete"
          }
        }
      }
    },
    "analysis": {
      "filter": {
        "autocomplete_filter": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 20
        }
      },
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "autocomplete_filter"
          ]
        }
      }
    }
  }
}
I'm indexing some objects:
$ curl -XPOST --data-binary @docs.json http://localhost:9200/test_ngram/my_type/_bulk
{ "index": { "_id": 1 }}
{ "name": "Brown foxes" }
{ "index": { "_id": 2 }}
{ "name": "Yellow furballs" }
Now trying to search:
$ curl -XPOST --data-binary @search.json localhost:9200/test_ngram/my_type/_search?pretty
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.15891947,
    "hits" : [ {
      "_index" : "test_ngram",
      "_type" : "my_type",
      "_id" : "1",
      "_score" : 0.15891947,
      "_source":{ "name": "Brown foxes" }
    } ]
  }
}
So this returns only the first document, not the second one, unlike the result shown in the documentation. I ran an explain on the query, and the result shows that the query doesn't seem to be analyzed with the edge_ngram token filter:
$ curl -XPOST --data-binary @search.json 'localhost:9200/test_ngram/my_type/_validate/query?explain&pretty'
{
  "valid" : true,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "explanations" : [ {
    "index" : "test_ngram",
    "valid" : true,
    "explanation" : "filtered(name:brown name:fo)->cache(_type:my_type)"
  } ]
}
It searches for the whole terms "brown" or "fo", but not for "b", "br", "bro" and so on, which is the behavior I expected and which would return both documents. I also tried to force the analyzer by setting both index_analyzer and search_analyzer, with no luck.
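To make the expectation concrete, here is a small pure-Python sketch of what I understand the autocomplete analyzer should produce. It is only a model, not Elasticsearch: the function names are made up for illustration, a regex split stands in for the standard tokenizer, and it assumes the documents really were indexed with edge n-grams.

```python
import re


def edge_ngrams(token, min_gram=1, max_gram=20):
    """Return the prefixes of `token` with length min_gram..max_gram."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def autocomplete_analyze(text, min_gram=1, max_gram=20):
    """Hypothetical model of the analyzer: tokenize, lowercase, edge n-gram."""
    # Rough stand-in for the standard tokenizer: split on non-word characters.
    tokens = [t for t in re.split(r"\W+", text) if t]
    terms = []
    for token in tokens:
        terms.extend(edge_ngrams(token.lower(), min_gram, max_gram))
    return terms


# Model of the index: each document's name run through the analyzer.
docs = {1: "Brown foxes", 2: "Yellow furballs"}
index = {doc_id: set(autocomplete_analyze(name)) for doc_id, name in docs.items()}


def matching_docs(query_terms):
    """Return ids of documents sharing at least one term with the query (OR match)."""
    return sorted(doc_id for doc_id, terms in index.items() if terms & set(query_terms))


# Query analyzed with the autocomplete analyzer, as I expected.
print(autocomplete_analyze("brown fo"))
print(matching_docs(autocomplete_analyze("brown fo")))

# Query terms as reported by the explain output above.
print(matching_docs(["brown", "fo"]))
```

In this model, the n-grammed query matches both documents because "furballs" also yields the term "f", while the plain terms "brown" and "fo" match only document 1, which is what I actually get.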
I'm pretty sure I'm doing something wrong, but I can't put my finger on it. Does anyone have any clue?
Thanks