Is there a way to delay writing to replicas while indexing?
When indexing data continuously, can we reply to the client after writing only to the primary shard, and have the replicas written a little later?
Things to note:
- Replicas cannot be reduced to zero, since data is coming in continuously (this is not an initial data load).
- I've looked into setting index.translog.durability: async and increasing index.translog.sync_interval.
Another question: does increasing index.translog.sync_interval mean that data is held in the cache on the primary shard's node and flushed to disk for both primary and replica after this interval, or is the data held in the cache on both the primary and replica nodes?
Why do you want to do this?
To increase the indexing performance of one index.
That is exactly the default behavior of Elasticsearch on index operations.
By default, write operations only wait for the primary shards to be active before proceeding (i.e. wait_for_active_shards=1). This default can be overridden in the index settings dynamically by setting index.write.wait_for_active_shards. To alter this behavior per operation, the wait_for_active_shards request parameter can be used.
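To make the two knobs from the docs concrete, here is a rough sketch in Python that only builds the HTTP requests (method, path, JSON body) and does not send anything. The index name "my-index" and the helper function names are made up for illustration; the setting name and the request parameter are the ones quoted above.

```python
import json
from urllib.parse import urlencode


def index_settings_request(index, wait_for_active_shards):
    """Build a dynamic settings update that changes the index-level default.

    `index` is a hypothetical index name; nothing is sent over the network.
    """
    body = {"index.write.wait_for_active_shards": str(wait_for_active_shards)}
    return "PUT", f"/{index}/_settings", json.dumps(body)


def index_doc_request(index, doc, wait_for_active_shards=None):
    """Build an index request, optionally overriding the default per operation."""
    path = f"/{index}/_doc"
    if wait_for_active_shards is not None:
        # Per-request override via the query string parameter.
        path += "?" + urlencode({"wait_for_active_shards": wait_for_active_shards})
    return "POST", path, json.dumps(doc)


if __name__ == "__main__":
    print(index_settings_request("my-index", 1))
    print(index_doc_request("my-index", {"message": "hello"}, wait_for_active_shards=2))
```

Keep in mind that wait_for_active_shards is a check made before the write starts: it controls how many shard copies must be active, not when the replicas receive the data.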
Now, that IS different from setting replicas to 0 while indexing...
The difference is that resources will still be used to replicate the data to the replica shards.
That is why sometimes it makes sense to set replicas to 0 when indexing... perhaps that is what you are thinking of...
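For reference, the "replicas to 0 while indexing" approach usually looks something like the sketch below: drop index.number_of_replicas before a big load and put it back afterwards. This is only an illustration that builds the two settings requests without sending them; the index name and function names are hypothetical. With zero replicas, losing a node during the load can mean losing data, so it mostly suits a one-off load that can be re-run.

```python
import json


def replica_settings_body(number_of_replicas):
    """Return the JSON body for updating index.number_of_replicas."""
    if number_of_replicas < 0:
        raise ValueError("number_of_replicas must be >= 0")
    return json.dumps({"index": {"number_of_replicas": number_of_replicas}})


def bulk_load_plan(index, normal_replicas):
    """Return the settings requests to send before and after a large load.

    `index` is a hypothetical index name; `normal_replicas` is whatever the
    index normally runs with. Nothing is sent over the network here.
    """
    path = f"/{index}/_settings"
    return [
        ("PUT", path, replica_settings_body(0)),                # before the load
        ("PUT", path, replica_settings_body(normal_replicas)),  # after the load
    ]


if __name__ == "__main__":
    for method, path, body in bulk_load_plan("my-index", 1):
        print(method, path, body)
```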
But then you say this...
Which I do not understand... data coming in continuously is not a requirement for having replicas...
All that said, I would look at How to Tune for Indexing Speed and start with that.