Are objects searchable after updates?

artelk · August 28, 2017, 12:14pm

Imagine you have an object with a property X.
The initial value of the property is 13.
The object is searchable by a predicate "X > 10".
Someone modifies the property value to 42.
Is it guaranteed that the object will still be searchable by the same predicate after the update is sent to Elasticsearch? The predicate matches both the old and the new property value.

As far as I know, updates are internally transformed into delete/insert pairs.
Is it possible that a query with the predicate "X > 10" returns nothing because the old state of the object has already been deleted and the new one isn't indexed yet?
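
For concreteness, the predicate "X > 10" from the question could be expressed as a range query. This is only a minimal sketch that builds the search body; the field name X comes from the question, while the helper name build_range_query is made up for illustration.

```python
import json


def build_range_query(field, lower_bound):
    """Build a search body matching documents where `field` > `lower_bound`."""
    return {"query": {"range": {field: {"gt": lower_bound}}}}


# The predicate from the question: X > 10
body = build_range_query("X", 10)
print(json.dumps(body))
```

Both the old value (13) and the new value (42) satisfy this query, which is why the question is about visibility during the update rather than about matching.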

warkolm · August 29, 2017, 7:58am

Yes, it is possible.

You may want to look at the ?refresh parameter, documented in the Elasticsearch Reference [5.5].
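
As a rough illustration of the writer side, here is a sketch that builds (but does not send) a partial update request with refresh=wait_for. The host, the index name "objects", the type name "object", and the document ID are all assumptions made up for the example, and the URL shape with a type segment is the 5.x style.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_update_request(base_url, index, doc_type, doc_id, partial_doc):
    """Build a partial-update request that waits for a refresh before returning."""
    # refresh=wait_for asks Elasticsearch to respond only once the change is searchable.
    params = urlencode({"refresh": "wait_for"})
    url = f"{base_url}/{index}/{doc_type}/{doc_id}/_update?{params}"
    data = json.dumps({"doc": partial_doc}).encode("utf-8")
    return Request(
        url,
        data=data,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical names: host, index "objects", type "object", ID "1".
req = build_update_request("http://localhost:9200", "objects", "object", "1", {"X": 42})
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would actually send it; the sketch stops short of that so it can be read without a running cluster.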

artelk · August 29, 2017, 8:12am

Thank you, Mark.
But how could refresh help?
I was talking about a query made in parallel by some other client.
That client can get nothing just because the object is being modified, even if both values of X (the old and the new one) are greater than 10, correct? Can this somehow be solved without a global lock?

warkolm · August 29, 2017, 8:14am

A refresh makes sure the document is searchable.

Wait for refresh is not a global lock. It is the client waiting for that refresh on the shard where it knows the data will exist, to make sure it gets a response.

artelk · August 29, 2017, 8:26am

OK, the writer will wait until the object is searchable.
But the reader knows nothing about that and will read in parallel.
So I need to use a global lock to sync the readers and the writer (which also does the refresh), right?
Or maybe there is a better solution?

artelk · August 29, 2017, 9:52am

The writer will wait for its changes to be available for searches (by specifying the refresh option).
The reader should wait for the writer and shouldn't query Elasticsearch until the writer completes its work (including waiting until the changes are available for searches). Otherwise it is possible that the request will return nothing, even if the predicate is true for both values (the old and the new one). Please correct me if I'm wrong.

The reader and the writer can be on different machines. So this synchronization requires some kind of distributed lock, right?
It would be great if there were a better alternative.

warkolm · August 29, 2017, 10:00am

It's not a global lock, there is no such thing in Elasticsearch.

Refreshes happen, by default, every second.

artelk · August 29, 2017, 10:09am

I mean I need to use an explicit global distributed lock in the client code to synchronize the reader and the writer, to avoid objects disappearing on the reader side. Without synchronization, if the reader periodically queries for the object by that predicate, it is possible that some of the requests return nothing and on the next request the object appears again.
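
One possible client-side mitigation for a periodically polling reader, short of a distributed lock, is to treat an object as gone only after it has been absent from several consecutive polls. This is not something proposed in the thread and it gives no guarantee; it is a sketch of a generic debouncing idea, and the class name, method name, and threshold are all hypothetical.

```python
class MissingDebouncer:
    """Report an ID as missing only after it is absent from several polls in a row."""

    def __init__(self, required_misses=2):
        self.required_misses = required_misses
        self.known_ids = set()   # every ID seen in any poll so far
        self.miss_counts = {}    # consecutive polls in which each ID was absent

    def observe(self, found_ids):
        """Record one poll result and return the IDs now considered missing."""
        found = set(found_ids)
        self.known_ids |= found
        confirmed_missing = set()
        for doc_id in self.known_ids:
            if doc_id in found:
                self.miss_counts[doc_id] = 0
            else:
                self.miss_counts[doc_id] = self.miss_counts.get(doc_id, 0) + 1
                if self.miss_counts[doc_id] >= self.required_misses:
                    confirmed_missing.add(doc_id)
        return confirmed_missing


debouncer = MissingDebouncer(required_misses=2)
first = debouncer.observe({"a", "b"})   # both present
second = debouncer.observe({"b"})       # "a" absent once: not reported yet
third = debouncer.observe({"a", "b"})   # "a" reappears: its counter resets
fourth = debouncer.observe({"b"})       # "a" absent once again
fifth = debouncer.observe({"b"})        # "a" absent twice in a row: reported
```

The trade-off is latency: a genuinely deleted object is reported only after required_misses polls.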

artelk · August 29, 2017, 10:37am

It would be best if Elasticsearch returned the old state of the object until the refresh happens and the new state is indexed.

artelk · August 30, 2017, 8:01am

A few more questions:

If some other property Y is changed, will the object disappear from search results for some time?

Let's assume you have established a parent/child relationship and a new child is added. Can that hide the parent and/or the other children from readers for some time?

artelk · September 1, 2017, 10:32am

Guys, could you please answer? This is important for our project.

warkolm · September 1, 2017, 10:38am

If you'd like SLA based response times we will happily put you in touch with someone from our sales team. Otherwise please have patience and we will answer you when we can.

As for me, it's Friday night, so hopefully someone else will pop in in the meantime.

warkolm · September 7, 2017, 8:02am

No, it won't.

artelk · September 8, 2017, 9:12am

That's great, thank you Mark!
Do you know if there are plans to change how Elasticsearch handles changes to the same object, so that it returns the old state until indexing of the new one completes? I believe this is the only step left to make it usable as a real NoSQL store (not only as a great search tool).

warkolm · September 8, 2017, 9:13am

That's the only thing it can do.

artelk · September 8, 2017, 9:18am

What do you mean? The object won't be hidden from searches while its properties are being updated (even if it is searched by the property being modified)?

warkolm · September 8, 2017, 9:21am

The only time that can happen is if you make a request to a shard that is applying the change at the exact same time.

The chances of that, while not zero, are low.

artelk · September 8, 2017, 9:48am

We have about a million documents in the storage. About 10% of them are updated during the day at arbitrary times.
Also, there are several systems that search for objects by some predicates (they use scroll to fetch all the objects for which the predicate is true, then produce reports, etc.). They can do this in parallel with updates; the updates and searches are not synchronized.
The question is whether some objects can be missing from a report just because they are being updated at that moment (even if the predicate is true for both the old and the new state of the object).

warkolm · September 8, 2017, 10:07am

It's a possibility.

system · October 6, 2017, 10:08am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
