Why is there a difference of 20 documents (cat vs count)?

asil · May 22, 2022, 4:11am

I am comparing the document counts reported by the count and cat APIs, and I'm getting different results:

GET /my_index/_count gives:

{
   "count": 4020199,
   "_shards": {
      "total": 1,
      "successful": 1,
      "skipped": 0,
      "failed": 0
   }
}

And GET /_cat/indices gives:

green open my_index 53IuNvZ1T4W_8u1U4bf7kb 1 1 8040418 0 27gb 13.5gb

Comparing the two quantities gives:

8040418 - 4020199 * 2 (one replica) = 20

Note: I ran a refresh and the numbers did not change.

So what accounts for those 20 documents?

stephenb · May 22, 2022, 3:34pm

Hmm, you have something odd going on...

These two commands should return exactly the same number. You do not need to multiply the count by the number of replicas; _count returns the number of documents regardless of replicas.

I picked a random index; note that the document count is 10034437 in both.

GET /_cat/indices/filebeat-7.15.2-2022.05.16-000168?v

health status index                             uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   filebeat-7.15.2-2022.05.16-000168 qIO3tHowQtyIytZmqC62Lw   1   1   10034437            0      8.5gb          4.2gb

GET /filebeat-7.15.2-2022.05.16-000168/_count

{
  "count" : 10034437,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  }
}
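As a rough way to check the two outputs above against each other, here is a minimal Python sketch. It only parses the text pasted in this post and does not talk to a cluster. The helper name `cat_docs_count` is made up for illustration, and it assumes the cat output was requested with `?v` so that a header row is present.

```python
import json

# Output copied from the two requests above.
cat_output = """\
health status index                             uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   filebeat-7.15.2-2022.05.16-000168 qIO3tHowQtyIytZmqC62Lw   1   1   10034437            0      8.5gb          4.2gb
"""

count_output = """
{
  "count" : 10034437,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }
}
"""


def cat_docs_count(text):
    """Return docs.count from single-index `_cat/indices?v` output (hypothetical helper)."""
    # First line is the header, second line is the index row.
    header, row = [line.split() for line in text.strip().splitlines()[:2]]
    return int(row[header.index("docs.count")])


cat_count = cat_docs_count(cat_output)
api_count = json.loads(count_output)["count"]
difference = cat_count - api_count

print(cat_count, api_count, difference)
```

For this index the difference is zero, with no adjustment for the replica.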

Christian_Dahlqvist · May 22, 2022, 3:50pm

Can you share the output of the cat shards API for the index as well as the mappings for it? Are you by any chance using nested mappings?

If I remember correctly, the count API returns the number of top-level documents and ignores nested documents, while the cat API includes nested documents because they count against the Lucene per-shard document limit. If that is the case here, and your documents mostly have one nested document each but some have more or fewer, that may explain the difference.
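To make that idea concrete, here is a small Python sketch of the counting model. It assumes the explanation above is right, which is not confirmed in this thread: each nested object is stored as its own Lucene document next to its parent, and the cat figure is the Lucene-level total. The field name `items` and the sample documents are invented for illustration.

```python
def lucene_doc_count(docs, nested_field):
    """Count Lucene-level documents: one per parent plus one per nested object."""
    return sum(1 + len(doc.get(nested_field, [])) for doc in docs)


# Invented sample data: three top-level documents with 1, 2 and 0 nested objects.
docs = [
    {"id": 1, "items": [{"sku": "a"}]},
    {"id": 2, "items": [{"sku": "b"}, {"sku": "c"}]},
    {"id": 3, "items": []},
]

top_level = len(docs)                           # what _count would report
lucene_level = lucene_doc_count(docs, "items")  # parents + nested objects

# Same model applied to the numbers from the original question.
count_api = 4020199   # from GET /my_index/_count
cat_docs = 8040418    # docs.count from GET /_cat/indices

implied_nested = cat_docs - count_api   # nested documents, if the model holds
surplus = implied_nested - count_api    # nested documents beyond one per parent

print(top_level, lucene_level, implied_nested, surplus)
```

Under this model the index would hold 4020219 nested documents, 20 more than the number of top-level documents, which matches the gap in the question. The mappings and the cat shards output would still be needed to confirm it.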

