The terms facet returns incorrect counts when an index consists of several
shards and term frequencies differ between shards.
Here is how to reproduce the bug:
First, create an index 'test' with 2 shards. After that,
execute 3 times:
After that, we have seven documents: four on the first shard (routing
0) and three on the second shard.
Now execute this query:
{
  "facets": {
    "value": {
      "terms": {
        "field": "value",
        "size": 1
      }
    }
  },
  "query": { "match_all": {} },
  "size": 0
}
The response is:
{
  .....
  "facets" : {
    "value" : {
      "_type" : "terms",
      "missing" : 0,
      "total" : 7,
      "other" : 4,
      "terms" : [{
        "term" : 5,
        "count" : 3
      }]
    }
  }
}
But the frequency of the term '5' is 4!
If I change the facet size to 2, I receive the correct response:
{
  .....
  "facets" : {
    "value" : {
      "_type" : "terms",
      "missing" : 0,
      "total" : 7,
      "other" : 0,
      "terms" : [{
        "term" : 5,
        "count" : 4
      }, {
        "term" : 7,
        "count" : 3
      }]
    }
  }
}
Elasticsearch takes the [size] most frequent terms on each shard, not
across the whole index. That is acceptable for a query_and_fetch request, but
when I execute query_then_fetch I expect to receive the correct answer.
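To illustrate what I think is happening, here is a minimal Python sketch of merging per-shard top-[size] lists. It is not Elasticsearch code. The function names are hypothetical, and the per-shard distribution of values (three 5s and one 7 on the first shard, one 5 and two 7s on the second) is an assumption chosen only because it is consistent with the totals and counts shown above.

```python
from collections import Counter


def shard_top_terms(values, size):
    """Return the `size` most frequent terms on a single shard."""
    return Counter(values).most_common(size)


def merge_facets(shards, size):
    """Merge per-shard top-`size` lists, as the observed responses suggest."""
    merged = Counter()
    for values in shards:
        # Each shard reports only its own top `size` terms; counts for
        # terms below that cutoff never reach the merge step.
        for term, count in shard_top_terms(values, size):
            merged[term] += count
    total = sum(len(values) for values in shards)
    top = merged.most_common(size)
    other = total - sum(count for _, count in top)
    return {
        "total": total,
        "other": other,
        "terms": [{"term": term, "count": count} for term, count in top],
    }


# Assumed distribution: shard 0 holds four documents, shard 1 holds three.
shards = [[5, 5, 5, 7], [5, 7, 7]]

print(merge_facets(shards, 1))
# {'total': 7, 'other': 4, 'terms': [{'term': 5, 'count': 3}]}
print(merge_facets(shards, 2))
# {'total': 7, 'other': 0, 'terms': [{'term': 5, 'count': 4}, {'term': 7, 'count': 3}]}
```

With size 1 the second shard reports only the term 7, so its single occurrence of 5 is lost and the merged count is 3 instead of 4. With size 2 both shards report both terms and the merged counts are exact.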
Could you please fix this bug?