Different responses for search with terms query and get document by id

Here is the scenario: the app is running an integration test and spins up a Docker container using the official Docker image, version 8.7.0. When it’s up and running, the test does its setup (it uses the Spring Data Elasticsearch library, but I’ve enabled request tracing to see what exactly gets sent to ES). It creates an index with custom mappings (all fields used later in the search are defined as keywords), then goes on with a test case, which involves creating two documents, updating them, and making sure they are actually updated. Both creation and updating of the documents go as expected, meaning the app issues /index/_bulk?refresh=false requests, which produce a 200 OK response. When creating the index we set index.refresh_interval to 1s, so it’s expected that right after a bulk request the data won’t be visible until a refresh happens. So we poll with Awaitility, getting the document by its id and checking its fields to see if it was updated.

The problem is that when getting the document by id (GET /$index/_doc/$id), at some point it is evident that a refresh took place, since we see all expected changes within the document. When that happens we do a search (POST /$index/_search?search_type=query_then_fetch&typed_keys=true) with a query that looks like this:

```
{
  "aggregations":{
    "totalSum":{"sum":{"field":"anountField"}}
  },
  "from":0,
  "query":{
    "bool":{
      "must":[
        {"terms":{"idField":["2","1"]}},
        {"term":{"unchangedField":{"value":"false"}}}
      ]
    }
  },
  "size":10,
  "sort":[
    {"sortField1":{"mode":"max","order":"desc"}},
    {"sortField2":{"mode":"max","order":"desc"}}
  ],
  "track_scores":false,
  "track_total_hits":2147483647,
  "version":true
}
```

The search request returns the expected documents (since we asked for them by ids), but surprisingly their contents are slightly different from the values we see when getting them individually by id. To expand on “slightly”: the update changes two fields, giving each a new value. When we get the document by id, both fields have the expected value. When we search, the document has field1 with the expected value and field2 with an unexpected (namely: the previous) value.

When we take the crudest approach and simply do Thread.sleep(1000) after issuing those bulk requests, everything works just fine. I still feel like this is not the optimal approach. I’m aware of wait_for_refresh, but it’s not really useful in my case. So the question is: why does this difference happen? Is it expected to happen? Perhaps there is something wrong with my query?

Welcome to the forum @jprucia !!

For a first time poster, you’ve written an excellent post. Very clear and precise.

I think you have simply misunderstood what “refresh=” means here. refresh=false pretty much means the bulk call should return once the docs are indexed, but you do NOT need those specific documents to be returned by any search yet. If you do need those updates to be returned by a search right away, then don’t use refresh=false, and be prepared to wait a bit longer.

You will always get the most recent version of a document by simply looking up its _id.

But a search is not a lookup.

The index refresh_interval is related but not the same. It’s not a guarantee. Put simply, a refresh_interval of 1s does NOT mean an index will always refresh every second. Closer would be “please try and refresh active shards every second, using your best efforts”.

You have effectively described the expected behaviour.

As for it being “evident” that a refresh took place once the get by id shows the changes: this is erroneous. You can’t know a refresh took place via this test.
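To make the lookup-versus-search distinction concrete, here is a toy model in Java. It is not Elasticsearch code: `ToyShard` and its methods are made up for illustration. It only mimics the idea that a get by id is realtime by default and sees the latest write, while a search sees what was there at the last refresh. The mixed result (field1 new, field2 old) is reproduced under the assumption that the two fields were written by separate requests with a refresh in between, which the post does not confirm.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a single shard. Illustration only, not Elasticsearch internals. */
class ToyShard {
    // Latest written version of each document; get-by-id reads from here.
    private final Map<String, String> latest = new HashMap<>();
    // Snapshot taken at the last refresh; searches read from here.
    private Map<String, String> searchable = new HashMap<>();

    void index(String id, String source) {
        latest.put(id, source);
    }

    String getById(String id) {
        return latest.get(id);
    }

    String search(String id) {
        return searchable.get(id);
    }

    void refresh() {
        searchable = new HashMap<>(latest);
    }

    public static void main(String[] args) {
        ToyShard shard = new ToyShard();
        shard.index("1", "field1=old, field2=old");
        shard.refresh();
        // Assumption: the update arrives as two writes, with a refresh in between.
        shard.index("1", "field1=new, field2=old");
        shard.refresh();
        shard.index("1", "field1=new, field2=new");
        // No refresh yet: the lookup sees the last write, the search does not.
        System.out.println("get by id: " + shard.getById("1"));
        System.out.println("search:    " + shard.search("1"));
    }
}
```

In this model a successful get by id says nothing about what a search will return, which is the point made above.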

Thank you for your reply.

I think my understanding of the refresh parameter was correct. My understanding of the refresh interval was most probably correct too (I kind of assumed that in some cases ES might come a little late).

What I did not understand correctly was the fact that “lookup is not search”.

After learning this I ditched the lookup approach and replaced it with a simple search: basically a single term query fetching the document with the given id, with no aggregations, no sorting, and no other fancy stuff. Done that way, and supplemented with Awaitility waiting at least 5 seconds, it all works as expected. The assumption here being that if I index 2 documents and then wait until one of those documents has proper data when searching, then surely the index has been refreshed. I really hope this expectation is ok, at least for simple integration tests :slight_smile:
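For anyone reading along, the polling idea can be sketched with the JDK alone. This is not the Awaitility API: `Poll.until` is a hypothetical stand-in, and the condition in `main` is a placeholder for "run the term query by id and compare the fields".

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Minimal stand-in for a polling library. Illustration only. */
class Poll {
    /** Re-checks the condition until it holds or the timeout elapses. */
    static boolean until(BooleanSupplier condition, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out without the condition ever holding
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag set
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Placeholder condition: becomes true roughly 300 ms after start.
        boolean met = until(() -> System.currentTimeMillis() - start >= 300,
                Duration.ofSeconds(5), Duration.ofMillis(50));
        System.out.println("condition met: " + met);
    }
}
```

The timeout is an upper bound here: the loop returns as soon as the condition holds, so a generous limit does not slow down the passing case.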

Thank you again as I think you clarified very important point, that wasn’t clear for me.

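If I’ve parsed this correctly, and the assumption is that once one of the two documents shows the updated data in a search the whole index has been refreshed, then the assumption seems not 100% safe. Background refresh is done at shard, not index, level.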

So if you index doc1 then doc2, and doc2 shows that recent update in a subsequent search, then the specific shard containing doc2 has been refreshed. But it guarantees you nothing about doc1, as doc1 is maybe on a different shard.
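A toy sketch of that situation, again in plain Java and not real Elasticsearch code. `ToyIndex`, its `Shard` class, and the routing by `hashCode` are all invented for illustration (real Elasticsearch routes by a hash of the routing value, but not this one). With two shards, ids "1" and "2" happen to land on different shards in this model.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy index with per-shard refresh. Illustration only. */
class ToyIndex {
    static class Shard {
        final Map<String, String> latest = new HashMap<>();
        Map<String, String> searchable = new HashMap<>();

        void refresh() {
            searchable = new HashMap<>(latest);
        }
    }

    private final List<Shard> shards = new ArrayList<>();

    ToyIndex(int shardCount) {
        for (int i = 0; i < shardCount; i++) {
            shards.add(new Shard());
        }
    }

    // Hypothetical routing function, chosen only to keep the example short.
    Shard shardFor(String id) {
        return shards.get(Math.floorMod(id.hashCode(), shards.size()));
    }

    void index(String id, String source) {
        shardFor(id).latest.put(id, source);
    }

    String search(String id) {
        return shardFor(id).searchable.get(id);
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex(2);
        index.index("1", "old");
        index.index("2", "old");
        index.shardFor("1").refresh();
        index.shardFor("2").refresh();

        index.index("1", "new");
        index.index("2", "new");
        index.shardFor("2").refresh(); // only the shard holding doc 2 refreshes

        System.out.println("doc 2 in search: " + index.search("2"));
        System.out.println("doc 1 in search: " + index.search("1"));
    }
}
```

If the test polls until every document it cares about shows the update, rather than just one of them, it no longer matters how the documents are spread over shards.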

In passing, “proper data” is a curious construction that I’d personally suggest avoiding, as it is somewhat ambiguous. From context, I presume you mean “most recent data”.

My assumption is probably 100% safe because of the environment I’m running in. As I mentioned at the beginning: we are talking about a single Docker container running the official ES image. The only code talking to this particular node is an integration test, which is a single thread doing just a bunch of requests in a sequential manner.

I’m aware that this is a special setup, and I would not expect such behaviour from a live deployment with multiple shards and so on.

That being said I was still able to fix my problem. Many thanks for your answers!

The current approach depends on an implementation detail (single shard) rather than a guarantee. IMHO that makes it fragile by definition.

It also creates a knowledge trap: there’s nothing in the system itself preventing someone from changing the shard count and introducing subtle, hard-to-diagnose bugs.

Also, “probably 100% safe” is a contradiction. Either something is guaranteed by design, or it’s relying on assumptions. In this case, it’s clearly the latter.
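One way to replace the assumption with something explicit is to have the test call the refresh API (POST /<index>/_refresh) after the bulk requests and before searching. Below is a minimal sketch using the JDK HTTP client types. `RefreshRequests`, the base URL, and the index name are placeholders I made up, and the request is only built here, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Builds an explicit refresh request. Host and index name are placeholders. */
class RefreshRequests {
    /** Builds POST {baseUrl}/{index}/_refresh for the given index. */
    static HttpRequest refresh(String baseUrl, String index) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_refresh"))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) {
        // Assumed values: a local test container and a made-up index name.
        HttpRequest request = refresh("http://localhost:9200", "my-test-index");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

In a real test this would be sent with `java.net.http.HttpClient`, or through whatever equivalent the client library in use offers, before the search assertions run.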

My car’s brakes work decently, well, unless I drive faster than 60. After that I know they’re a bit dodgy, but I’m 100% safe because I’ll never drive faster than 60!? :rofl:
