Elasticsearch occasionally returns a decreased 'hits total' value

Hi community,
I have a Flink job that consumes Kafka messages and loads them into Elasticsearch. To monitor the number of documents loaded into Elasticsearch, I wrote a query that returns the 'hits total':

GET some-indice/_search
{
   "track_total_hits":true
}

Another way is the count API:

GET some-indice/_count
{
}

Normally, this gives me an increasing 'hits total' value because of the streaming data. However, when I run these queries frequently, I occasionally get a value lower than the one from the previous query. Is it normal for Elasticsearch to occasionally return a decreased 'hits total' value? Nothing is deleting documents from Elasticsearch.

Hey,

Wild guess here without any further info: is this index being constantly indexed into? If so, your searches may be served by different shard copies (sometimes the primary shard, sometimes a replica). Each of those copies refreshes its data at a different time, so a search served by a copy that has not refreshed yet can return a lower count than an earlier search served by a copy that had.
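
One way to test this guess, as a rough sketch: send every request with the same custom preference string, so that Elasticsearch routes them to the same shard copies as long as the cluster state and the selected shards do not change. The value monitor-count below is an arbitrary, made-up string (any value that does not start with an underscore should work), and some-indice is the index name from the question.

```
GET some-indice/_count?preference=monitor-count
```

If the count stops going backwards with a fixed preference, differing refresh times across shard copies are the likely cause. It can still change which copy answers after a node failure or shard relocation.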

Hope this helps as a start.

You can also run POST my-index/_refresh and then use the _cat/shards API to check whether all shard copies hold the same number of documents. This only holds if there is no concurrent indexing going on; otherwise the explanation above applies again.
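
For illustration, the two requests could look roughly like this. The index name my-index is a placeholder, and the column list passed to h is just one possible selection:

```
POST my-index/_refresh

GET _cat/shards/my-index?v&h=index,shard,prirep,state,docs,node
```

In the output, prirep shows whether a row is a primary (p) or a replica (r), and docs is the document count of that copy. After a refresh with no indexing in flight, the primary and replica rows of the same shard number should report the same docs value.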

Also, please always specify your Elasticsearch version. Thanks!

--Alex
