Could someone clarify the index refresh behavior? I understand and see
in my app and tests that the writes are made visible to my (currently)
single node app at 1 second intervals.
My app will be a multi-node cluster with each app node also acting as
an embedded ES.
Question is: Will an update to master node A be made available to the
replicas within that 1 second refresh period, or is this only true for
the local node?
We have https://github.com/elasticsearch/elasticsearch/issues/1063
where there is talk of having blocking calls where a write will only
come back when it is visible everywhere. Is that possible or will that
only be possible in a single node architecture?
When a document is indexed, it goes through the primary shard and is then
replicated to the replica shards. The master node plays no part here.
Each shard (primary and replica) has a refresh interval; once a refresh is
executed, the operations performed against that shard since the last refresh
become visible for search. Note that get has "realtime" visibility.
No, indexing data means it gets indexed on the relevant nodes / shards.
The refresh interval is agnostic to that.
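To make the distinction above concrete, here is a minimal toy model in Python. It is not Elasticsearch code; the Shard class and index_document function are hypothetical names invented for illustration. It only sketches the behavior described in this reply: a write reaches the primary and every replica, get sees it right away, and search sees it only after each shard copy has run its own refresh.

```python
class Shard:
    """Toy stand-in for one shard copy (primary or replica)."""

    def __init__(self):
        self.docs = {}        # everything indexed so far; read by get
        self.searchable = {}  # what search can see; updated only on refresh

    def index(self, doc_id, doc):
        self.docs[doc_id] = doc

    def get(self, doc_id):
        # "Realtime" lookup: does not wait for a refresh.
        return self.docs.get(doc_id)

    def refresh(self):
        # Expose all operations since the last refresh to search.
        self.searchable = dict(self.docs)

    def search(self, predicate):
        return [doc for doc in self.searchable.values() if predicate(doc)]


def index_document(primary, replicas, doc_id, doc):
    """Write to the primary first, then replicate to every replica."""
    primary.index(doc_id, doc)
    for replica in replicas:
        replica.index(doc_id, doc)


primary, replica = Shard(), Shard()
index_document(primary, [replica], "1", {"title": "hello"})

found_by_get = replica.get("1")                # already present on the replica
before = replica.search(lambda doc: True)      # empty until the replica refreshes
primary.refresh()                              # refreshes the primary only
still_before = replica.search(lambda doc: True)
replica.refresh()
after = replica.search(lambda doc: True)       # now contains the document
```

In this sketch the replica holds the document as soon as the index call returns, but whether a search finds it depends on when that particular shard copy last refreshed.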
That would be really awesome and would simplify a lot of use cases, such
as saving a change on an edit screen, navigating straight to the
associated list screen, and seeing the results of the update there
consistently. An indexAndBlockUntilReplicatedAndVisible flag in the
request would be great.
We have https://github.com/elasticsearch/elasticsearch/issues/1063
where there is talk of having blocking calls where a write will only
come back when it is visible everywhere. Is that possible or will that
only be possible in a single node architecture?
If it's implemented, we should be able to implement it for the multi-node
case as well.
I see. So the end result is the same, but forcing the refresh is disruptive
to performance.
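For anyone reading along, here is a rough sketch of what "forcing the refresh" on a write can look like over the REST API, using the refresh=true query parameter on the index request. The function name, host, index name and the "_doc" path segment are assumptions for illustration (older versions use a type name in that position), and the request is only built here, not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_index_request(base_url, index, doc_id, doc, force_refresh=False):
    """Build (but do not send) an HTTP request that indexes one document.

    With force_refresh=True the request asks for a refresh right after the
    write, which makes it searchable sooner at a performance cost.
    """
    url = f"{base_url}/{index}/_doc/{doc_id}"
    if force_refresh:
        url += "?" + urlencode({"refresh": "true"})
    return Request(
        url,
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical host and index name, for illustration only.
request = build_index_request(
    "http://localhost:9200", "articles", "1", {"title": "hello"}, force_refresh=True
)
```

Doing this on every write is what gets expensive as write volume grows, which is why a blocking-until-visible option would be nicer.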
We could definitely benefit from this functionality also. We use refresh
now, but our writes are minimal (at the moment) so we don't notice much
impact.