Wrong values in _count API in ES 5.3.0?

mkelkar · April 13, 2017, 7:33pm

Hi There,
I am using ES 5.3.0. I have indexed about 2.2M documents in my index, and the _count API is returning results that differ widely from what the _cat/indices API reports. Is there something wrong?

curl localhost:9200/_cat/indices?v
health status index      pri rep docs.count docs.deleted store.size pri.store.size
green  open   index1     6   2     172544            0    384.7mb        128.2mb
green  open   index2     6   2    2708259        74040      6.8gb          2.1gb

$ curl localhost:9200/index1/_count?pretty
    {
      "count" : 84916,
      "_shards" : {
        "total" : 6,
        "successful" : 6,
        "failed" : 0
      }
    }
$ curl localhost:9200/index2/_count?pretty
    {
      "count" : 1027782,
      "_shards" : {
        "total" : 6,
        "successful" : 6,
        "failed" : 0
      }
    }

Thanks
Madhav.

leom · April 13, 2017, 8:26pm

Looks like you have 2 replica shards for each of your indices. I would initially guess that the _count API returns the number of primary documents, and the _cat/indices API returns total # of documents including replicas.

That would only make sense if you were still inserting documents between your _cat/indices query and your _count queries (since 3 x 80K = 240K != 170K; but maybe when you made the _cat/indices query you actually had about 65K primary documents). Is that the case?
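The arithmetic behind the replica guess can be checked against the exact numbers from the first post. This is only a minimal sketch: the function name `expected_if_replicas_counted` is made up for illustration, and it tests nothing more than the assumption that docs.count equals the _count result multiplied by (1 + replicas).

```python
def expected_if_replicas_counted(count, replicas):
    """docs.count we would expect if every replica copy were counted too (assumption)."""
    return count * (1 + replicas)


# (index, _count result, docs.count from _cat/indices, replicas), taken from the first post
observations = [
    ("index1", 84916, 172544, 2),
    ("index2", 1027782, 2708259, 2),
]

for name, count, cat_docs, rep in observations:
    expected = expected_if_replicas_counted(count, rep)
    ratio = cat_docs / count
    print(f"{name}: hypothesis predicts {expected}, "
          f"_cat/indices shows {cat_docs} (ratio {ratio:.2f})")
```

With 2 replicas the guess predicts a ratio of 3, but the posted numbers give roughly 2.0 and 2.6, so the guess only holds if documents were being written between the two calls.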

leom · April 13, 2017, 8:32pm

According to the "cat indices" page of the Elasticsearch Guide [5.3]:

We can tell quickly how many shards make up an index, the number of docs at the Lucene level, including hidden docs (e.g., from nested types), deleted docs, primary store size, and total store size (all shards including replicas). All these exposed metrics come directly from Lucene APIs.

It does sound like the cat query returns primary + replica documents (since it's # docs at the Lucene level).

mkelkar · April 13, 2017, 9:22pm

@leom Interesting read about the stats coming from Lucene. However, I do see that in ES 1.7 both the count and cat APIs return the same value:

curl localhost:9200/_cat/indices?v
health status index               pri rep docs.count docs.deleted store.size pri.store.size
green  open   &shared-apple-apple   6   2    2201729       851343     10.7gb          3.5gb

curl 'localhost:9200/&shared-apple-apple/_count?pretty'
{
  "count" : 2201729,
  "_shards" : {
    "total" : 6,
    "successful" : 6,
    "failed" : 0
  }
}

Not sure if the semantics changed in 5.X...
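To compare the two APIs side by side without reading the columns by eye, the pasted output can be parsed directly. The following is a rough sketch that works on the text shown in this thread rather than on a live cluster; the helper names `parse_cat_indices` and `count_gap` are hypothetical, and it assumes the `?v` header row is present and that no field contains spaces.

```python
def parse_cat_indices(text):
    """Parse column-aligned `_cat/indices?v` output into a dict keyed by index name."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    rows = {}
    for ln in lines[1:]:
        fields = dict(zip(header, ln.split()))
        rows[fields["index"]] = fields
    return rows


def count_gap(cat_row, count_response):
    """docs.count from _cat/indices minus "count" from the _count response body."""
    return int(cat_row["docs.count"]) - count_response["count"]


# Output pasted from the ES 5.3.0 cluster in the first post
cat_53 = """\
health status index      pri rep docs.count docs.deleted store.size pri.store.size
green  open   index1     6   2     172544            0    384.7mb        128.2mb
green  open   index2     6   2    2708259        74040      6.8gb          2.1gb
"""

rows = parse_cat_indices(cat_53)
print(count_gap(rows["index1"], {"count": 84916}))
print(count_gap(rows["index2"], {"count": 1027782}))
```

A gap of zero, as in the ES 1.7 output above, means the two APIs agree; a non-zero gap is the discrepancy being asked about here.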

