I have an index with 2 shards holding about 1600M documents each (~3200M total). When I run a date histogram, the bucket counts do not add up to the total number of documents; the sum is much lower. What is the reason?
Please show an example of what you are seeing and the query you are running. Also provide information about which version of Elasticsearch you are using and how you determine the expected result.
Hi Christian,
ES 7.8
query:
{
  "size": 0,
  "aggs": {
    "count": {
      "date_histogram": {
        "field": "last_indexed",
        "interval": "year"
      }
    }
  }
}
result:
{
  "took": 4971,
  "timed_out": true,
  "_shards": {
    "total": 2,
    "successful": 2,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 10000,
      "relation": "gte"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "count": {
      "buckets": [
        {
          "key_as_string": "2022-01-01T00:00:00.000Z",
          "key": 1640995200000,
          "doc_count": 137538486
        }
      ]
    }
  }
}
index/_count
{
  "count": 3151836577,
  "_shards": {
    "total": 2,
    "successful": 2,
    "skipped": 0,
    "failed": 0
  }
}
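For anyone comparing the two numbers, here is a minimal Python sketch that sums the histogram buckets and sets the result against the _count total. It works on the plain JSON responses shown above, trimmed to the relevant fields. The helper name summarize_histogram is hypothetical and not part of any Elasticsearch client. The sketch also surfaces the "timed_out" flag: the search response above reports "timed_out": true, and a timed-out search can return partial results, so that may be worth ruling out before looking at the data itself.

```python
from typing import Any, Dict


def summarize_histogram(search_response: Dict[str, Any],
                        total_count: int,
                        agg_name: str = "count") -> Dict[str, Any]:
    """Sum the doc_count of every bucket and compare it with total_count.

    Hypothetical helper for illustration; it only reads the parsed JSON.
    """
    buckets = (search_response.get("aggregations", {})
               .get(agg_name, {})
               .get("buckets", []))
    bucket_sum = sum(bucket["doc_count"] for bucket in buckets)
    return {
        # True means the search hit its time limit; results may be partial.
        "timed_out": bool(search_response.get("timed_out", False)),
        "bucket_sum": bucket_sum,
        "missing": total_count - bucket_sum,
    }


# Values copied from the responses in this thread (other fields omitted).
search_response = {
    "timed_out": True,
    "aggregations": {
        "count": {
            "buckets": [
                {"key": 1640995200000, "doc_count": 137538486},
            ]
        }
    },
}

summary = summarize_histogram(search_response, total_count=3151836577)
print(summary)
```

If "timed_out" is true, rerunning the query with a longer timeout and checking the flag again would show whether the gap comes from the time limit or from the data.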
Do all documents have the last_indexed field populated?
What does this return (not tested)?
POST index/_count
{
  "query": {
    "bool": {
      "must_not": {
        "exists": {
          "field": "last_indexed"
        }
      }
    }
  }
}
Are you using nested mappings?
What does the _cat/indices API show for this index?
Yes, all documents have the field "last_indexed".
{
  "count": 0,
  "_shards": {
    "total": 2,
    "successful": 2,
    "skipped": 0,
    "failed": 0
  }
}
_cat/indices
{
  "health": "green",
  "status": "open",
  "index": "xxx",
  "uuid": "xxx",
  "pri": "2",
  "rep": "2",
  "docs.count": "3152018717",
  "docs.deleted": "62473287",
  "store.size": "1.5tb",
  "pri.store.size": "516.6gb"
}
We are not using nested documents.