Data race condition in automaton queries

Hello,

In a local unit test involving a runtime field and a regexp query on ES 8.12.1, I am seeing inconsistent hit counts. The query below uses the script parity, which returns even or odd depending on the value of another numeric field. It sometimes returns an unexpected number of hits.

{
  "query": {
    "regexp": {
      "outer_parity": {
        "value": "e.e."
      }
    }
  },
  "runtime_mappings": {
    "outer_parity": {
      "type": "keyword",
      "script": {
        "lang": "painless",
        "source": "parity"
      }
    }
  }
}

I believe that, since the change that enabled search parallelization by default, the queries extending AbstractStringScriptFieldAutomatonQuery have a data race. If I understand that change correctly, the matching code of these queries can now be called concurrently from several threads.

Therefore, the BytesRefBuilder scratch in that class is shared by all threads that execute the search on different segments, which would lead to the race condition I am seeing. With the query above, scratch could for example contain even although the values list passed as an argument contains only odd. That would explain the inconsistent hit counts.
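
To make the suspicion concrete, here is a minimal standalone sketch of the pattern I mean. It is not the Elasticsearch code: the class names are hypothetical, StringBuilder stands in for BytesRefBuilder, and java.util.regex.Pattern stands in for the compiled automaton. SharedScratchMatcher shows the shape I believe is unsafe (one buffer stored in a field and reused by every call), and LocalScratchMatcher shows one possible way around it (a buffer per call).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.regex.Pattern;

public class ScratchSketch {
    // Stand-in for the compiled automaton of the regexp "e.e."; a Pattern is safe to share.
    static final Pattern AUTOMATON = Pattern.compile("e.e.");

    /** Unsafe shape (assumed): one buffer stored in a field and reused by every call. */
    static final class SharedScratchMatcher {
        private final StringBuilder scratch = new StringBuilder();

        boolean matches(List<String> values) {
            for (String value : values) {
                scratch.setLength(0);
                scratch.append(value);
                // If another thread runs matches() now, it can overwrite scratch
                // before the check below, so "odd" may be tested as "even".
                if (AUTOMATON.matcher(scratch).matches()) {
                    return true;
                }
            }
            return false;
        }
    }

    /** Safe shape: each call owns its buffer, so the matcher has no shared mutable state. */
    static final class LocalScratchMatcher {
        boolean matches(List<String> values) {
            StringBuilder scratch = new StringBuilder();
            for (String value : values) {
                scratch.setLength(0);
                scratch.append(value);
                if (AUTOMATON.matcher(scratch).matches()) {
                    return true;
                }
            }
            return false;
        }
    }

    /** Runs the safe matcher over all documents on a thread pool and counts the hits. */
    static int countHits(List<List<String>> docs, int threads) throws Exception {
        LocalScratchMatcher matcher = new LocalScratchMatcher();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (List<String> doc : docs) {
                results.add(pool.submit(() -> matcher.matches(doc)));
            }
            int hits = 0;
            for (Future<Boolean> result : results) {
                if (result.get()) {
                    hits++;
                }
            }
            return hits;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // 1000 documents alternating between "even" and "odd".
        List<List<String>> docs = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            docs.add(List.of(i % 2 == 0 ? "even" : "odd"));
        }
        System.out.println(countHits(docs, 8)); // prints 500
    }
}
```

With the per-call buffer the count is stable across runs. Swapping in SharedScratchMatcher under the same thread pool is where I would expect occasional wrong counts (or even exceptions from the unsynchronized StringBuilder), which resembles what I observe in my test.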

What do you think?
