[PLEASE_HELP] Consistent searching with search_after

Hello, I need help making one of our queries consistent. We run a query that matches a large number of documents. The problem is with an index whose documents have identical properties and values except for the name and a custom field holding an id. The name follows the pattern "api-flights-{number}", where number goes from 1 to 330.

We split this query into N sub-queries: an initial query followed by search_after requests. The first request returns the correct number of documents in the correct order. But when we take the sort values of the last element and send the same query again with search_after, sometimes the next page comes back correctly, and other times we get exactly the same page that the previous query returned, but with different sort values (always a lower score). Moreover, when I keep paging from there, I almost always end up getting the last page of these 330 elements, or sometimes another page.

I tried changing the search_type to dfs_query_then_fetch, and also adding a preference set to the request-id that is currently being executed (I tried the search_type with and without the preference, in all combinations). I still can't retrieve all the elements Elasticsearch reports in the total hits (the response to the initial query says the total number of hits is 330).

I'm using Elasticsearch 7.4. We search and sort by score and by a custom keyword field that has unique values. I'm really lost and don't know what else to try. If someone knows what is happening here, I would really appreciate it.

This is the query we are making:

Query number 1:

{
  "size": 20,
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "api-flights",
            "fields": ["*"],
            "type": "bool_prefix",
            "operator": "OR",
            "slop": 0,
            "prefix_length": 0,
            "max_expansions": 50,
            "zero_terms_query": "NONE",
            "auto_generate_synonyms_phrase_query": true,
            "fuzzy_transpositions": true,
            "boost": 1.0
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1.0
    }
  },
  "explain": false,
  "_source": {
    "includes": ["*"],
    "excludes": []
  },
  "sort": [
    {
      "_score": {
        "order": "desc"
      }
    },
    {
      "node.@id": {
        "order": "asc"
      }
    }
  ],
  "highlight": {
    "fields": {
      "*": {}
    }
  }
}

Rest of the queries:

{
  "size": 20,
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "api-flights",
            "fields": ["*"],
            "type": "bool_prefix",
            "operator": "OR",
            "slop": 0,
            "prefix_length": 0,
            "max_expansions": 50,
            "zero_terms_query": "NONE",
            "auto_generate_synonyms_phrase_query": true,
            "fuzzy_transpositions": true,
            "boost": 1.0
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1.0
    }
  },
  "explain": false,
  "_source": {
    "includes": ["*"],
    "excludes": []
  },
  "sort": [
    {
      "_score": {
        "order": "desc"
      }
    },
    {
      "node.@id": {
        "order": "asc"
      }
    }
  ],
  "search_after": [
    2.12323,
    "https://host/api-flights-189/"
  ],
  "highlight": {
    "fields": {
      "*": {}
    }
  }
}
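
For reference, the paging described above can be sketched roughly as follows. This is only a minimal sketch, not our actual code: `iter_hits` and the `search` callable are hypothetical names, and `search` is assumed to take a request body as a dict and return the parsed JSON response. The point it illustrates is that the `sort` array of the last hit is passed back as `search_after` exactly as received, without reformatting the score.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

Body = Dict[str, Any]


def iter_hits(
    search: Callable[[Body], Dict[str, Any]],  # hypothetical: sends the body, returns parsed JSON
    body: Body,
    page_size: int = 20,
) -> Iterator[Dict[str, Any]]:
    """Yield every hit by following search_after from page to page."""
    after: Optional[List[Any]] = None
    while True:
        # Copy the base body so the caller's dict is never mutated.
        request = dict(body, size=page_size)
        if after is not None:
            request["search_after"] = after
        hits = search(request)["hits"]["hits"]
        if not hits:
            return  # an empty page means there is nothing left
        yield from hits
        # Reuse the last hit's sort values verbatim for the next request.
        after = hits[-1]["sort"]
```

With a loop like this, the number of yielded hits should match the reported total of 330 if the ordering is stable between requests.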

The search_after mechanism is not stable if you keep indexing, updating, or deleting documents.

If you need stable searches, take a look at the Point in Time API.
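
A rough sketch of how a Point in Time could be combined with search_after, in case it helps. As far as I know, the Point in Time API was added in Elasticsearch 7.10, so this would not work on 7.4 without an upgrade. The helper name `with_pit` and the index name `my-index` are made up for illustration; the helper only builds the request body and does not talk to the cluster.

```python
from typing import Any, Dict, Optional, Sequence

Body = Dict[str, Any]


def with_pit(
    body: Body,
    pit_id: str,
    keep_alive: str = "1m",
    search_after: Optional[Sequence[Any]] = None,
) -> Body:
    """Return a copy of `body` that searches against an open point in time."""
    request = dict(body)  # shallow copy, the original body stays untouched
    request["pit"] = {"id": pit_id, "keep_alive": keep_alive}
    if search_after is not None:
        request["search_after"] = list(search_after)
    return request


# Assumed request flow (my-index is a placeholder index name):
#   1. POST /my-index/_pit?keep_alive=1m   -> response contains {"id": "<pit id>"}
#   2. POST /_search with with_pit(body, pit_id)
#      (no index in the path, the PIT already identifies it)
#   3. POST /_search with with_pit(body, pit_id, search_after=<sort of last hit>)
#      using the most recent pit_id returned by the previous response
#   4. DELETE /_pit with {"id": "<pit id>"} once paging is finished
```

Every page is then read from the same snapshot of the index, so the sort values from one page should still be valid for the next one.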

The thing is that this query runs within a couple of seconds, and the index doesn't receive any indexing, updates, or deletes. Is using the Point in Time API the only way to make search_after stable?
