Inconsistency in the number of total hits in results

Hi All,

I am using the query below for one of our use cases, and we are seeing something odd in the total hits count. The first time I send the request the total hits is 64, the second time it is 47, the third time it is 64 again, the fourth time 47, and so on. Can anyone please help me solve this issue?

{
    "query": {
        "bool": {
            "filter": [
                {
                    "match_phrase_prefix": {
                        "field1": {
                            "query": "D"
                        }
                    }
                },
                {
                    "match_phrase_prefix": {
                        "field2": {
                            "query": "66"
                        }
                    }
                }
            ]
        }
    },
    "aggs": {
        "result": {
            "terms": {
                "field": "field1.keyword",
                "order": {
                    "_term": "asc"
                },
                "size": 30
            }
        }
    }
}

Which version are you running?

I saw something like this in the past with a very old version where I suffered from a split brain issue and the shards (primaries and replicas) were not really in sync. I had to rebuild my indices to fix that issue.

The very latest version, 7.x. I have rebuilt the indices three times, but no luck.

It could be because the query you are using expands into a somewhat random grab-bag of terms found in the index (see the max_expansions part of the docs).

When you hit one replica, it might not yet have merged out some terms from deleted docs in the index, so the set of 50 expansion terms used by one replica may differ from the set used by the other replica.
If you try a more straightforward query, e.g. a match query, results should be more deterministic.
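
To make that explanation concrete, here is a toy simulation in plain Python. It does not talk to Elasticsearch at all. The term names, document counts and helper functions are all made up for illustration, and it assumes the prefix is expanded to the first 50 matching terms in sorted order, including terms that only survive from deleted docs.

```python
MAX_EXPANSIONS = 50  # default max_expansions for match_phrase_prefix


def expand_prefix(term_dictionary, prefix, max_expansions=MAX_EXPANSIONS):
    """Return the first max_expansions terms, in sorted order, that start with prefix."""
    matching = sorted(t for t in term_dictionary if t.startswith(prefix))
    return matching[:max_expansions]


def count_hits(live_doc_terms, expansions):
    """Count live docs whose term is among the chosen expansion terms."""
    allowed = set(expansions)
    return sum(1 for term in live_doc_terms if term in allowed)


# Hypothetical data: 60 live docs, each holding one distinct term d010..d069.
live_doc_terms = [f"d{i:03d}" for i in range(10, 70)]
# Terms d000..d009 belong only to deleted docs.
stale_terms = [f"d{i:03d}" for i in range(0, 10)]

# One shard copy has merged the deleted docs away, the other has not yet.
merged_copy = set(live_doc_terms)
unmerged_copy = set(live_doc_terms) | set(stale_terms)

hits_merged = count_hits(live_doc_terms, expand_prefix(merged_copy, "d"))
hits_unmerged = count_hits(live_doc_terms, expand_prefix(unmerged_copy, "d"))

print(hits_merged, hits_unmerged)  # prints: 50 40
```

On the unmerged copy, ten of the 50 expansion slots are spent on stale terms that match no live doc, so fewer live docs are found. If requests alternate between the two copies, the total would flip back and forth in the same way as the 64 and 47 in the question.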

But my requirement is to use the Match Phrase Prefix query only, so please suggest how I can solve this issue.

Several options:

  • Pass a routing preference in the request to ensure your client consistently visits only one of the replicas.
  • Force merge the indices down to 1 segment (only useful if you have no more additions/updates).
  • Have no replicas (only useful if you don't like resiliency).
  • Run a preliminary query to discover the terms you want to use in a follow-up query, rather than leaving a replica's selections to chance.
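
As a rough sketch of what each option could look like as a REST call, the Python below only builds the method, path and body and sends nothing. The index name my-index, the session id, and all helper names are hypothetical. The preliminary-query helper assumes field1.keyword holds the exact values you want to match on, which is case-sensitive and not the same as the analyzed match_phrase_prefix behaviour, so treat it as a starting point only.

```python
import json
from urllib.parse import urlencode

INDEX = "my-index"  # hypothetical index name


def search_with_preference(index, session_id):
    """Option 1: a custom preference string keeps repeat searches on the same shard copies."""
    return "POST", f"/{index}/_search?" + urlencode({"preference": session_id}), None


def force_merge(index):
    """Option 2: merge each shard down to a single segment."""
    return "POST", f"/{index}/_forcemerge?" + urlencode({"max_num_segments": 1}), None


def drop_replicas(index):
    """Option 3: keep only primaries, so each shard has a single copy to query."""
    return "PUT", f"/{index}/_settings", {"index": {"number_of_replicas": 0}}


def discover_terms(index, field, prefix, size):
    """Option 4, step 1: list field values starting with prefix via a terms aggregation.

    The prefix is placed into a regular expression as-is, so it is assumed
    to contain no regex special characters.
    """
    body = {
        "size": 0,
        "aggs": {
            "candidates": {
                "terms": {"field": field, "include": f"{prefix}.*", "size": size}
            }
        },
    }
    return "POST", f"/{index}/_search", body


def search_exact_terms(index, field, values):
    """Option 4, step 2: filter on the explicit values picked from step 1."""
    body = {"query": {"bool": {"filter": [{"terms": {field: values}}]}}}
    return "POST", f"/{index}/_search", body


if __name__ == "__main__":
    requests_to_show = [
        search_with_preference(INDEX, "user-42"),
        force_merge(INDEX),
        drop_replicas(INDEX),
        discover_terms(INDEX, "field1.keyword", "D", 100),
        search_exact_terms(INDEX, "field1.keyword", ["D1", "D2"]),
    ]
    for method, path, body in requests_to_show:
        print(method, path)
        if body is not None:
            print(json.dumps(body, indent=2))
```

With the last option the follow-up request names its terms explicitly, so it no longer depends on which 50 expansions a particular shard copy happens to pick.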
