Elasticsearch 7.17.1 - shards skipped even if they contain data

mihail-ca · June 18, 2024, 1:48pm

Hello,

We have the following document in Elasticsearch 7.17.1:

{
  "_index": "operate-list-view-8.1.0_2024-06-13",
  "_type": "_doc",
  "_id": "2251800712733897",
  "_score": 1.0,
  "_source": {
    "id": "2251800712733897",
    "key": 2251800712733897,
    "partitionId": 1,
    "processDefinitionKey": 2251800347125543,
    "state": "COMPLETED",
    "incident": false,
    "joinRelation": {
      "name": "processInstance",
      "parent": null
    },
    "processInstanceKey": 2251800712733897
  }
}

We want to search for this document, and we use the following request:

POST http://localhost:9200/operate-list-view*/_search?pretty=true
{
  "size": 50,
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "joinRelation": {
                  "value": "processInstance",
                  "boost": 1.0
                }
              }
            },
            {
              "terms": {
                "id": [
                  "2251800712733897"
                ],
                "boost": 1.0
              }
            }
          ],
          "adjust_pure_negative": true,
          "boost": 1.0
        }
      },
      "boost": 1.0
    }
  }
}

As you can see, the document matches the request. However, the response we get is:

{
    "took": 4,
    "timed_out": false,
    "_shards": {
        "total": 91,
        "successful": 91,
        "skipped": 90,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 0,
            "relation": "eq"
        },
        "max_score": null,
        "hits": []
    }
}

So no documents are found and, for some unknown reason, 90 of the 91 shards are skipped.

If we send the same request but remove the joinRelation filter shown here:

{
  "term": {
    "joinRelation": {
      "value": "processInstance",
      "boost": 1.0
    }
  }
}

then the document is found (because no shards are skipped in this case). The request without the filter is:

{
  "size": 50,
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": [
            {
              "terms": {
                "id": [
                  "2251800712733897"
                ],
                "boost": 1.0
              }
            }
          ],
          "adjust_pure_negative": true,
          "boost": 1.0
        }
      },
      "boost": 1.0
    }
  }
}

Response:

{
  "took": 29,
  "timed_out": false,
  "_shards": {
    "total": 91,
    "successful": 91,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1.0,
    "hits": [
      ...
    ]
  }
}
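
To make the difference between the two responses easier to see, here is a minimal Python sketch that summarizes the `_shards` section of each one. The helper name `shard_summary` is hypothetical, and the two dictionaries are copied from the responses above (trimmed to the fields used). In both responses `successful` equals `total` even when shards are skipped, so the sketch treats `successful - skipped` as the number of shards that actually ran the query.

```python
def shard_summary(response: dict) -> dict:
    """Summarize shard and hit counts from a search response (hypothetical helper)."""
    shards = response["_shards"]
    return {
        "total": shards["total"],
        "skipped": shards["skipped"],
        # In the responses above, skipped shards are still counted as successful,
        # so subtract them to get the shards that really executed the query.
        "queried": shards["successful"] - shards["skipped"],
        "hits": response["hits"]["total"]["value"],
    }


# Trimmed copies of the two responses shown in this post.
with_join_filter = {
    "_shards": {"total": 91, "successful": 91, "skipped": 90, "failed": 0},
    "hits": {"total": {"value": 0, "relation": "eq"}},
}
without_join_filter = {
    "_shards": {"total": 91, "successful": 91, "skipped": 0, "failed": 0},
    "hits": {"total": {"value": 1, "relation": "eq"}},
}

print("with joinRelation filter:   ", shard_summary(with_join_filter))
print("without joinRelation filter:", shard_summary(without_join_filter))
```

With the filter, only one shard out of 91 runs the query, and that shard returns no hits; without it, all 91 shards run the query.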

So the questions are:

Why are 90 shards skipped in the first case, even though the document clearly exists? Where should we look to find the cause of the skipped shards? The documentation is rather vague about this: "_shards.skipped: Number of shards that skipped the request because a lightweight check helped realize that no documents could possibly match on this shard."
Is there a way to force Elasticsearch not to skip shards, via cluster settings or at least at the request level?
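
For experimenting at the request level, here is a minimal Python sketch that builds the same search with or without the joinRelation filter, and optionally adds the `pre_filter_shard_size` query parameter. That parameter is documented for the search API as the shard-count threshold above which the pre-filter round trip is run. Whether changing it affects the skipped count in this case is an assumption to verify, not something established here. The function name `build_search_request` and the value 10000 are hypothetical, and the `boost` and `adjust_pure_negative` entries are left out on the assumption that they only restate defaults.

```python
import json
from urllib.parse import urlencode


def build_search_request(doc_id, with_join_filter=True, pre_filter_shard_size=None,
                         base_url="http://localhost:9200",
                         index_pattern="operate-list-view*"):
    """Return (url, body) for the search in this post (hypothetical helper)."""
    must = []
    if with_join_filter:
        # The clause whose presence coincides with 90 skipped shards.
        must.append({"term": {"joinRelation": {"value": "processInstance"}}})
    must.append({"terms": {"id": [doc_id]}})

    body = {
        "size": 50,
        "query": {"constant_score": {"filter": {"bool": {"must": must}}}},
    }

    params = {"pretty": "true"}
    if pre_filter_shard_size is not None:
        # Assumption to test: a high threshold may keep the pre-filter phase from running.
        params["pre_filter_shard_size"] = pre_filter_shard_size

    url = f"{base_url}/{index_pattern}/_search?{urlencode(params)}"
    return url, body


url, body = build_search_request("2251800712733897", pre_filter_shard_size=10000)
print("POST", url)
print(json.dumps(body, indent=2))
```

Sending both variants (with and without `pre_filter_shard_size`) and comparing `_shards.skipped` in the responses would show whether the parameter has any effect on this cluster.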

Thank you
