Hi all,
I have an issue with the text_phrase_prefix query in ES 0.19.3. When only a
small number of documents is indexed, the following query works perfectly
(note that I type "New Y", not "New York" or "New Yo"):
curl -X DELETE http://es1:9200/cities
curl -X POST "http://es1:9200/cities/city" -d '{ "city" : "New York" }'
curl -X POST "http://es1:9200/cities/city" -d '{ "city" : "North New York" }'
curl -X POST "http://es1:9200/cities/city" -d '{ "city" : "East New York" }'
curl -XGET "http://es1:9200/cities/city/_search?pretty=true" -d '
{
"fields":[
"city"
],
"query":{
"text_phrase_prefix": {
"city" : {
"query": "New Y",
"max_expansions": 2,
"prefix_length": 2
}
}
},
"from":0,
"size":20
}'
It returns all cities or areas that have "New York" in their names:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 3,
"max_score" : 0.38356602,
"hits" : [ {
"_index" : "cities",
"_type" : "city",
"_id" : "4ObkgggqS7uou1XLdwOkfA",
"_score" : 0.38356602,
"fields" : {
"city" : "New York"
}
}, {
"_index" : "cities",
"_type" : "city",
"_id" : "CZutMgvwSfa8O79Vajkshg",
"_score" : 0.30685282,
"fields" : {
"city" : "North New York"
}
}, {
"_index" : "cities",
"_type" : "city",
"_id" : "ZGA3gno9QnOIBg2MxxsPbg",
"_score" : 0.30685282,
"fields" : {
"city" : "East New York"
}
} ]
}
}
However, when the number of indexed documents grows (more than 30,000
cities, towns, or areas in the US), the above query no longer works.
I need to increase max_expansions to 18 or greater to make it work again.
Any smaller value does not work. If I don't increase max_expansions, I
need to use a longer query such as "New Yo" or "New York".
curl -XGET "http://184.72.29.x:9200/cities/city/_search?pretty=true" -d '
{
"fields":[
"area_label"
],
"query":{
"text_phrase_prefix": {
"area_label" : {
"query": "New Y",
"max_expansions": 18,
"prefix_length": 2
}
}
},
"from":0,
"size":20
}
'
returns
{
"took" : 7,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 3,
"max_score" : 111.86473,
"hits" : [ {
"_index" : "cities",
"_type" : "city",
"_id" : "195232",
"_score" : 111.86473,
"fields" : {
"area_label" : "New York"
}
}, {
"_index" : "cities",
"_type" : "city",
"_id" : "46727",
"_score" : 89.49178,
"fields" : {
"area_label" : "North New York"
}
}, {
"_index" : "cities",
"_type" : "city",
"_id" : "46772",
"_score" : 89.49178,
"fields" : {
"area_label" : "East New York"
}
} ]
}
}
prefix_length does not play any role in this case: when I increase
prefix_length to 20, the result is still the same.
I don't understand why 18 is the magic number in this case. My guess is that
there is a relationship between max_expansions and the number of
indexed documents, so that as the number of indexed documents increases, I
need to increase max_expansions too or the above query stops working.
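One way to picture this guess is the small Python sketch below. It assumes,
without confirmation from the ES source, that the trailing prefix ("y" in
"New Y") is expanded by walking the field's term dictionary in sorted order
and stopping after max_expansions terms. The function name expand_prefix and
both term lists are hypothetical and are not taken from the real index.

```python
from bisect import bisect_left


def expand_prefix(sorted_terms, prefix, max_expansions):
    """Return at most max_expansions terms that start with prefix.

    Assumption: terms are visited in sorted order and expansion simply
    stops once the cap is reached, so later terms are never considered.
    """
    start = bisect_left(sorted_terms, prefix)  # first term >= prefix
    expansions = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix) or len(expansions) >= max_expansions:
            break
        expansions.append(term)
    return expansions


# Hypothetical lowercased terms for the three-document index.
small_index = sorted({"new", "york", "north", "east"})

# Hypothetical larger index: more terms now share the prefix "y".
large_index = sorted(small_index + ["yakima", "yale", "yankton", "yonkers"])

print(expand_prefix(small_index, "y", 2))   # ['york']
print(expand_prefix(large_index, "y", 2))   # ['yakima', 'yale']
print(expand_prefix(large_index, "yo", 2))  # ['yonkers', 'york']
```

If this assumption holds, the value that matters is the number of distinct
terms sharing the prefix that sort before "york", rather than the document
count as such. That would also be consistent with "New Yo" working at a low
max_expansions, since a longer prefix leaves fewer candidate terms.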
Am I missing something?
Regards,
Dinh