Count query is returning inaccurate results

Count query (filter term) is returning inaccurate results - */_count

We have tried to get the count using a regular search query using this approach - "size":0,"track_total_hits":true and the count is still inaccurate.

Basically, the counts are the same in both the above queries.

In both the above cases, this is the _shards response -
"_shards":{"total":1,"successful":1,"skipped":0,"failed":0}

We are running v7.13 on the Elastic Cloud. Our index is about ~35m documents. We are running a Staging & Production server on the Cloud (with the same data) and the counts are invalid on both the servers.

We don't do any real-time continuous updates to the index - no updates, no deletes. We add all the ~35m documents via bulk updates using background jobs initially, and no more updates after that.

How do we know the counts are inaccurate - well we were so confused by the inaccuracies, we looped through each document using scroll API to run a loop and do a manual count, and indeed our index/data looks ok, but Elasticsearch, unfortunately, is returning the wrong counts.

How do we solve this issue?

What does the output of the cat indices API look like? Have you refreshed after you finished bulk loading, especially if you optimized by changing the refresh interval during loading? What does the full queries and results look like?

What do you get if you run GET /<PUT INDEX PATTERN HERE>/_count?q=*:* ?

@Christian_Dahlqvist Thanks for your response.

  • We did not change or optimize the refresh interval during loading

  • Just ran POST /_refresh and it came back under a second with the following response (but the counts did not change)

      {
        "_shards": {
          "successful": 43,
          "failed": 0,
          "total": 46
        }
      }
    
  • Following is the output of cat indices
    yellow open products_20210523141120879 Y76XjK2OTbeE4i_8EI6o6g 1 1 37142644 49 59.4gb 59.4gb

  • Following is the response of the /_count?q=*:* ?

      {
        "count": 37142644,
        "_shards": {
          "successful": 1,
          "failed": 0,
          "skipped": 0,
          "total": 1
        }
      }

This is the number of documents the index contains. Are you expecting a different number?

If you are expecting a different number, how did you index the data? If you used bulk requests, did you check that all writes were successful?

@Christian_Dahlqvist that total count 37142644 for the index looks about ok.

Also, we have retries built in our background jobs, so I dont think we run into any problems when doing bulk writes.

What is then giving inaccurate results?

The inaccuracies are when we run a filter query like this

{"query":{"bool":{"filter":[{"terms":{"level_2_classification_ids":[261243]}}]}},"size":0,"track_total_hits":true}

What result do you get and what are you expecting? What is the mapping for the field you are searching on?

If you are looking for the count of documents that match a filter criteria I would recommend to use an aggregation.

  • What we get is 199,421 and what we are expecting is 202,164, and some of the filter queries on other ids are even way off

  • This is the mapping for that field...

    "level_2_classification_ids": {
              "type": "long"
            },
  • We are using v7.13 (its the latest & greatest)

I vaguely recall there was some optimization introduced recently that improved performance by cutting the query short, which could result in inaccurate counts. Haven't found the blog post though yet, so could very well be mistaken. I would recommend trying an aggregation with a filter and see if that gives accurate results.

I wonder if the track_total_hits parameter mentioned here makes a difference. Can yoiu try setting this to true?

Yea I am running the search query with track_total_hits true, you need that for queries that return counts greater than 10,000

Also, /_count query is returning same inaccurate results.

Hmm, ok will try with aggregations.

So apparently, aggregations also return the same inaccurate counts. Here's the aggs query...

{
  "aggs": {
    "my-agg-name": {
      "terms": {
        "field": "level_2_classification_ids",
        "size": 1,
        "include": [
          261243
        ]
      }
    }
  },
  "size": 0
}

How are you deducing that the result is wrong?

Using the above approach.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.