I'm trying to build an autosuggest feature with the completion type, and I would like to filter the input with the length token filter (to restrict suggestions to entries that are at least X characters long). Unfortunately, the analyzer configured on the suggest field seems to completely ignore the token filters I set, both at index time and at suggest time. Here's my test setup:
Creating the index:
curl -XPUT "http://localhost:9200/test_index/" -d'
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0,
    "analysis": {
      "filter": {
        "length_gt": {
          "type": "length",
          "min": 10
        }
      },
      "analyzer": {
        "suggestions": {
          "type": "custom",
          "tokenizer": "lowercase",
          "filter": [
            "length_gt"
          ]
        }
      }
    }
  },
  "mappings": {
    "product": {
      "properties": {
        "description": {
          "type": "string"
        },
        "tags": {
          "type": "string"
        },
        "title": {
          "type": "string"
        },
        "tag_suggest": {
          "type": "completion",
          "analyzer": "suggestions",
          "search_analyzer": "suggestions",
          "payloads": false
        }
      }
    }
  }
}'
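To make explicit what I expect the `suggestions` analyzer to do, here is a rough Python approximation of my own (assuming the `lowercase` tokenizer splits on non-letter characters and lowercases each token, and the `length` filter then drops tokens shorter than `min`):

```python
import re

def suggestions_analyzer(text, min_len=10):
    """Rough simulation of the custom analyzer:
    'lowercase' tokenizer (split on non-letters, lowercase),
    then a 'length' filter with min=10."""
    # ASCII-only approximation of the letter-based tokenizer
    tokens = [t.lower() for t in re.findall(r"[A-Za-z]+", text)]
    return [t for t in tokens if len(t) >= min_len]

print(suggestions_analyzer("all this should be removed because it is too short"))
# → []
```

Every token in that sentence is shorter than 10 characters, so the analyzer should emit nothing for it.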
Indexing two documents:
curl -XPUT "http://localhost:9200/test_index/product/1" -d'
{
  "title": "Product1",
  "description": "Product1 Description",
  "tags": [
    "blog",
    "magazine",
    "responsive",
    "two columns",
    "wordpress"
  ],
  "tag_suggest": {
    "input": [
      "blog",
      "magazine",
      "responsive",
      "two columns",
      "wordpress"
    ]
  }
}'
curl -XPUT "http://localhost:9200/test_index/product/2" -d'
{
  "title": "Product2",
  "description": "Product2 Description",
  "tags": [
    "blog",
    "paypal",
    "responsive",
    "skrill",
    "wordland"
  ],
  "tag_suggest": {
    "input": [
      "blog",
      "paypal",
      "responsive",
      "skrill",
      "wordland"
    ]
  }
}'
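As a sanity check on what should actually end up in the suggester: with a minimum length of 10, only "responsive" (exactly 10 characters) should survive the filter across both documents. A quick Python check of my own, assuming the same tokenization as above:

```python
# All suggest inputs from both documents
inputs = ["blog", "magazine", "responsive", "two columns", "wordpress",
          "paypal", "skrill", "wordland"]

# The 'lowercase' tokenizer splits "two columns" into "two" and "columns";
# the length filter (min=10) then drops everything shorter than 10 characters.
tokens = [t for entry in inputs for t in entry.lower().split()]
surviving = [t for t in tokens if len(t) >= 10]
print(surviving)
# → ['responsive']
```

So if the analyzer were applied, suggestions for short prefixes like "bl" or "ma" should come back empty.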
Inspecting the index content:
curl -XPOST "http://localhost:9200/test_index/_search"
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1.0,
    "hits": [
      {
        "_index": "test_index",
        "_type": "product",
        "_id": "1",
        "_score": 1.0,
        "_source": {
          "title": "Product1",
          "description": "Product1 Description",
          "tags": [
            "blog",
            "magazine",
            "responsive",
            "two columns",
            "wordpress"
          ],
          "tag_suggest": {
            "input": [
              "blog",
              "magazine",
              "responsive",
              "two columns",
              "wordpress"
            ]
          }
        }
      }
    ]
  }
}
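For context, my suggest requests look roughly like this (the pre-5.0 `_suggest` endpoint; the request name `tag_suggest` is arbitrary). If the filter were applied, a 4-character input such as "blog" should never be suggested for this prefix:

```shell
curl -XPOST "http://localhost:9200/test_index/_suggest" -d'
{
  "tag_suggest": {
    "text": "bl",
    "completion": {
      "field": "tag_suggest"
    }
  }
}'
```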
EDIT:
Just to be sure that the analyzer itself works, I ran some text through my `suggestions` analyzer directly:
curl -XGET 'localhost:9200/test_index/_analyze' -d '
{
  "analyzer": "suggestions",
  "text": ["all this should be removed because it is too short", "the looooooooooooong words are the oooooooooooonly thing left"]
}'
which worked fine:
{
  "tokens": [
    {
      "end_offset": 71,
      "position": 111,
      "start_offset": 55,
      "token": "looooooooooooong",
      "type": "word"
    },
    {
      "end_offset": 101,
      "position": 115,
      "start_offset": 86,
      "token": "oooooooooooonly",
      "type": "word"
    }
  ]
}
Is there any reason why this wouldn't work? I have tried the stop token filter as well, and it is ignored too. This looks to me like a bug in ES, but IMHO it should work, since the completion field even has dedicated settings for both the index analyzer and the search analyzer. I'm getting really desperate here. Does anybody have any idea why this is happening?