Hello,
I am currently running into an error while reindexing documents from an index that holds a subset of my data. During the reindex they run through an ingest pipeline that creates vector embeddings with ELSER and splits the content into passages, which also get vector embeddings. I use a pipeline similar to this one.
The error in question looks like this:
{
  "completed": true,
  "task": {
    "node": "nodeId",
    "id": 168105,
    "type": "transport",
    "action": "indices:data/write/reindex",
    "status": {
      "total": 10000,
      "updated": 0,
      "created": 59,
      "deleted": 0,
      "batches": 59,
      "version_conflicts": 0,
      "noops": 0,
      "retries": {
        "bulk": 0,
        "search": 0
      },
      "throttled_millis": 0,
      "requests_per_second": -1,
      "throttled_until_millis": 0
    },
    "description": "reindex from [1-subset] to [1-elser]",
    "start_time_in_millis": 1719323450496,
    "running_time_in_nanos": 1067795555450,
    "cancellable": true,
    "cancelled": false,
    "headers": {
      "trace.id": "3b984be232f7068413ddfb716b77fc02"
    }
  },
  "error": {
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": -1,
        "index": null,
        "reason": {
          "type": "search_context_missing_exception",
          "reason": "No search context found for id [14257]"
        }
      }
    ],
    "caused_by": {
      "type": "search_context_missing_exception",
      "reason": "No search context found for id [14257]"
    }
  }
}
I recognize this error: a while back I hit the same one in my local Docker container when reindexing documents that took too long. Back then I assumed it was my local hardware, but this time I am running on a cloud trial and still hit it.
Initially I thought the problem was that I had set the batch size too high, and lowering it did fix the issue at the time. Now the error is back even though I have set the size to 1.
My Reindex request:
POST _reindex?wait_for_completion=false
{
  "source": {
    "index": "Index1",
    "size": 1
  },
  "dest": {
    "index": "Index2",
    "pipeline": "chunker-elser-v2"
  }
}
The reindex request is pretty simple: set the batch size to 1 and add the pipeline.
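Since I run the reindex with wait_for_completion=false, I retrieve the result afterwards with the tasks API, which is where the error output above comes from. Roughly like this, using the node and task id returned by the reindex call:
GET _tasks/nodeId:168105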
My Pipeline:
PUT _ingest/pipeline/chunker-elser-v2
{
  "processors": [
    {
      "script": {
        "description": "Chunk content into sentences by looking for . followed by a space",
        "lang": "painless",
        "if": "ctx.content != null && !ctx.content.isEmpty()",
        "source": "\n String[] envSplit = /((?<!M(r|s|rs)\\.)(?<=\\.) |(?<=\\!) |(?<=\\?) )/.split(ctx['content']);\n ctx['passages'] = new ArrayList();\n int i = 0;\n boolean remaining = true;\n if (envSplit.length == 0) {\n return\n } else if (envSplit.length == 1) {\n Map passage = ['text': envSplit[0]];ctx['passages'].add(passage)\n } else {\n while (remaining) {\n Map passage = ['text': envSplit[i++]];\n while (i < envSplit.length && passage.text.length() + envSplit[i].length() < params.model_limit) {passage.text = passage.text + ' ' + envSplit[i++]}\n if (i == envSplit.length) {remaining = false}\n ctx['passages'].add(passage)\n }\n }\n ",
        "params": {
          "model_limit": 400
        }
      }
    },
    {
      "foreach": {
        "field": "passages",
        "processor": {
          "inference": {
            "model_id": ".elser_model_2",
            "input_output": {
              "input_field": "_ingest._value.text",
              "output_field": "_ingest._value.vector.predicted_value"
            },
            "on_failure": [
              {
                "append": {
                  "field": "_source._ingest.inference_errors",
                  "value": [
                    {
                      "message": "Processor 'inference' in pipeline 'chunker-elser-v2' failed with message '{{ _ingest.on_failure_message }}'",
                      "pipeline": "ml-inference-title-vector",
                      "timestamp": "{{{ _ingest.timestamp }}}"
                    }
                  ]
                }
              }
            ]
          }
        },
        "if": "ctx.passages != null"
      }
    },
    {
      "inference": {
        "if": "ctx.title != null && !ctx.title.isEmpty()",
        "model_id": ".elser_model_2",
        "input_output": {
          "input_field": "title",
          "output_field": "ml.title.vector.predicted_value"
        },
        "on_failure": [
          {
            "append": {
              "field": "_source._ingest.inference_errors",
              "value": [
                {
                  "message": "Processor 'inference' in pipeline 'ml-inference-title-vector' failed with message '{{ _ingest.on_failure_message }}'",
                  "pipeline": "ml-inference-title-vector",
                  "timestamp": "{{{ _ingest.timestamp }}}"
                }
              ]
            }
          }
        ]
      }
    }
  ]
}
As mentioned before, the pipeline is based on this example from Elastic's blog, although I modified it slightly.
Would increasing the scroll time be a viable solution?
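For example, something like this, assuming I understand correctly that the scroll parameter on the reindex API controls how long each search context is kept alive (the default being 5 minutes):
POST _reindex?scroll=30m&wait_for_completion=false
{
  "source": {
    "index": "Index1",
    "size": 1
  },
  "dest": {
    "index": "Index2",
    "pipeline": "chunker-elser-v2"
  }
}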
Any ideas on how to fix this would be appreciated!
I have also seen this blog pop up, which is awesome!