Elasticsearch returns 10000 rows even when only ~100 documents are relevant

Hello, I need some help with the following issue.
I run wildcard queries on an index in Kibana Dev Tools. If I run one query at a time, it returns only the relevant hits. If the queries run in parallel (two browser tabs), both return 10000 hits.
This is true for any of the queries below. It looks related to the load on Elasticsearch, or to something I have not found in the documentation so far. The same happens when our application runs the queries via the Java API.
The index has 6 shards and 2 replicas, but the same happens with 1 shard and no replicas.
Any help is appreciated, because our client is a little bit crabby about the many irrelevant hits.
Thank you, Attila

GET someindex*/_search
{
  "query": {
    "wildcard": {
      "somefield": "*sometext*"
    }
  }
} 

GET someindex*/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "wildcard": {
            "somefield": "*sometext*"
          }
        }
      ]
    }
  }
}

GET someindex*/_search
{
  "query": {
    "bool": {
      "minimum_should_match": 1, 
      "should": [
        {
          "wildcard": {
            "somefield": "*sometext*"
          }
        }
      ]
    }
  }
}

Hi @bonyolult,
You can add size as a query parameter, for example to get 7 results:

GET someindex*/_search?size=7

You could also add size in the body with:

GET someindex*/_search
{
  "size": 7,
  "query": {
    "wildcard": { "somefield": "*sometext*" }
  }
}

Hi Andrew, thank you for your answer! We are dealing with an enormous amount of data. A query can return hundreds of thousands of hits, and all of them must be shown to the user paginated (that's the business requirement, I can't help it :)). So the first step is to count how many hits we can expect, and in that case I can't use the size parameter. When search_after is used, then of course we use the size setting. The other thing I don't understand is how a query can return e.g. the 103 relevant hits and then, on the second run (when other queries are running too), return everything. AFAIK it's clearly related to the load on Elasticsearch. When a query runs alone on the cluster it's fine and returns only what it has to.
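
For readers following along, the count-then-paginate flow described above could be sketched as request bodies like the ones below. This is only an illustration, assuming Elasticsearch 7.x or later, where a search reports hit totals only up to 10,000 unless track_total_hits is set. The helper names and the doc_id sort field are hypothetical and not taken from the thread; search_after needs a sort with a unique tiebreaker, which doc_id stands in for here.

```python
import json


def count_body(field, pattern):
    """Build a search body that asks only for an exact hit count."""
    return {
        "size": 0,                 # return no documents, just the total
        "track_total_hits": True,  # count past the default 10,000 limit (ES 7.x+)
        "query": {"wildcard": {field: pattern}},
    }


def page_body(field, pattern, page_size, sort_field, after=None):
    """Build one page of a search_after pagination request.

    sort_field is assumed to hold a unique value per document
    (hypothetical field, e.g. a keyword id), so pages do not overlap.
    """
    body = {
        "size": page_size,
        "track_total_hits": False,  # the total was already fetched by count_body
        "query": {"wildcard": {field: pattern}},
        "sort": [{sort_field: "asc"}],
    }
    if after is not None:
        # Sort values of the last hit from the previous page.
        body["search_after"] = after
    return body


if __name__ == "__main__":
    print(json.dumps(count_body("somefield", "*sometext*"), indent=2))
    print(json.dumps(page_body("somefield", "*sometext*", 50, "doc_id"), indent=2))
```

The first body would be sent once to learn the total, then page_body would be sent repeatedly, passing the sort array of the last hit of each page as after for the next one.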

Gotcha. Hmmm, I wouldn't expect the number of primaries or replicas to matter for the issue you are seeing. Just curious: what happens when you run the queries in parallel but from different clients? For example, instead of running them in two browser tabs on the same machine, try two different machines, each with one tab.

The reason was a query_string query without proper escaping: when the user searched for "something" with a wildcard, the generated query contained
"*\"something\"*" instead of "*something*", and this caused some strange and nondeterministic behaviour.

The other thing, related to the queries shown in the original post: it seems to be a Kibana bug, because running the same searches in Postman returns the expected hits.
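
The actual fix for the query_string escaping problem is not shown in the thread. As one possible approach, user input could be cleaned before it is wrapped in wildcards, roughly as sketched below. The function names are hypothetical. The reserved-character set follows the list in the query_string documentation, and dropping one pair of surrounding double quotes is an assumption based on the "*something*" target mentioned above.

```python
# Characters that query_string treats as syntax and that can be backslash-escaped.
RESERVED = set('+-=&|!(){}[]^"~*?:\\/')


def strip_outer_quotes(text):
    """Drop one pair of surrounding double quotes, if present (assumed behaviour)."""
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        return text[1:-1]
    return text


def escape_query_string(text):
    """Backslash-escape reserved characters; drop < and >, which cannot be escaped."""
    out = []
    for ch in text:
        if ch in "<>":
            continue
        out.append("\\" + ch if ch in RESERVED else ch)
    return "".join(out)


def contains_query(field, user_text):
    """Build a query_string body matching user_text anywhere in field.

    Whitespace is not handled here; query_string would still split
    the input into separate terms.
    """
    cleaned = escape_query_string(strip_outer_quotes(user_text))
    return {
        "query_string": {
            "default_field": field,
            "query": "*" + cleaned + "*",
        }
    }


if __name__ == "__main__":
    print(contains_query("somefield", '"something"'))
```

With this sketch, the only unescaped wildcards in the final query string are the two added by contains_query, so a user typing * or ? would be searching for those literal characters.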

Oh, good to know. Strange that Postman behaves differently from Kibana, though.
