Hi!
I just noticed that every time I ask for a facet, my whole query
behaves as if it were a facet_filter: the facets seem to be calculated
first for the whole index, and the query is then used to filter the
results, instead of the other way around. I'm not sure if this is an
issue or simply the way ES works.
I ran the following experiment on a 10 million doc index.
First I ask for documents with an impossible date range:
curl -XPOST 'http://localhost:9200/ng_test_10m/_search?search_type=count&pretty=1' -d'
{
  "size": 0,
  "query": {
    "range": {
      "date": {
        "from": "2011-07-01T13:00:00Z",
        "to": "2011-06-30T14:30:00Z"
      }
    }
  }
}
'
It comes back in 2 milliseconds with no results:
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : 0.0,
    "hits" : [ ]
  }
}
But if I take the same query and add a facet on some field (its values
are arrays of integers):
curl -XPOST 'http://localhost:9200/ng_test_10m/_search?search_type=count&pretty=1' -d'
{
  "size": 0,
  "query": {
    "range": {
      "date": {
        "from": "2011-07-01T13:00:00Z",
        "to": "2011-06-30T14:30:00Z"
      }
    }
  },
  "facets": {
    "categories": {
      "terms": {
        "field": "categories"
      }
    }
  }
}
'
It takes almost 20 seconds to come back, still without results:
{
  "took" : 19669,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "facets" : {
    "categories" : {
      "_type" : "terms",
      "missing" : 0,
      "terms" : [ ]
    }
  }
}
I don't know whether I should open an issue or whether this is the
expected behavior, maybe to make changing the facet scope easy?
Either way, it is a performance problem if facet time grows linearly
with the size of the index, independently of the number of documents
the query actually matches.
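In case anyone wants to repeat the comparison on their own index, here
is a minimal shell sketch for pulling the "took" value out of a search
response and for timing a request body stored in a file. The helper
names (took_ms, run_query) are made up for illustration and are not part
of Elasticsearch or curl; the URL is the one from my requests above.

```shell
#!/bin/sh
# Sketch only: took_ms and run_query are hypothetical helper names,
# not part of Elasticsearch or curl.

# Read a search response on stdin and print its "took" value (milliseconds).
took_ms() {
  sed -n 's/.*"took"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# POST the request body stored in file $1 to the search URL and print "took".
# The URL is the one used in this post; adjust host and index as needed.
run_query() {
  curl -s -XPOST 'http://localhost:9200/ng_test_10m/_search?search_type=count' \
    -d @"$1" | took_ms
}
```

With the two request bodies saved as plain.json and facet.json
(hypothetical file names), running run_query plain.json and
run_query facet.json prints the two timings side by side for comparison.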
thanks