This is about Elasticsearch version 7.17, specifically how the flush API and the index.translog.durability setting are connected to each other.
My understanding is that if index.translog.durability is set to request, then the data for each indexing operation is first written to the translog, the translog is synced to disk, and only then is the response returned to the client; the translog entries are deleted later, after a flush.
Is that correct, or am I missing something here?
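For reference, this is how that setting would be applied explicitly (the index name is just an example; request is also the default):

    PUT /my-index/_settings
    {
      "index.translog.durability": "request"
    }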
You're correct.
You almost certainly don't need to call the flush API yourself. As per its docs:
Elasticsearch automatically triggers flushes as needed [...] It is also possible to trigger a flush on one or more indices using the flush API, although it is rare for users to need to call this API directly.
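For completeness, a manual flush of a single index looks like this (index name is hypothetical):

    POST /my-index/_flush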
@DavidTurner Some more context; curious to see if the answer is still the same:
We are removing a data node from a running cluster, and we were thinking of using flush in case some translog data has not been synced to disk.
I guess this would make sense if index.translog.durability were set to async,
but we have the default configuration, index.translog.durability: request, which means the translog is synced before the request is acknowledged.
Hence I thought of double-checking.
Let me know your views based on the above context.
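For what it's worth, one way to double-check the effective setting (index name is illustrative; include_defaults is needed because request is the default and so isn't stored explicitly):

    GET /my-index/_settings/index.translog.*?include_defaults=true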
The flush API has nothing to do with syncing the translog to disk; it's for flushing the index structures.
But the docs you shared talk about data:
" Flushing a data stream or index is the process of making sure that any data that is currently only stored in the transaction log is also permanently stored in the Lucene index."
Exactly. The data is in the translog before it's flushed.
So do you mean it is the index-structure data that is in the translog, and not the index data?
Also, what are the scenarios where users call this API?
For a rolling restart, the doc below suggests calling the flush API.
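Presumably that's the cluster-wide flush, i.e. something like:

    POST /_flush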
It might help in a rolling restart, but it's not required; the data is already safe on disk in the translog.
Please note that this node is going to be out of the cluster. And even if the data is in the translog on this node, that does not guarantee the data has been replicated to the other nodes of the cluster.
Here is the sequence of events:
1. An index request comes to a data node.
2. The data is written to the translog but not yet synced to disk or flushed to segments.
3. index.translog.durability is async, so the translog is synced at a regular interval.
4. Before the next interval arrives, this node goes down and is out of the cluster.
5. The data which was not synced is now lost.
6. When the node rejoins the cluster later, the cluster will consider this translog data stale.
This is the data loss I was trying to highlight. Let me know if I missed anything.
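For context, the async configuration that creates this window would look something like this at index creation (index name and interval are illustrative; 5s is the default sync_interval):

    PUT /my-index
    {
      "settings": {
        "index.translog.durability": "async",
        "index.translog.sync_interval": "5s"
      }
    }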
Do you not have replica shards configured for your indices?
Yeah, what Christian said. Flushing has nothing to do with replication.
There are 3 replicas configured for the index shards. It's a cluster deployed in production with TBs of data coming in every day.
I guess what you mean by the above line is: flushing is the process of pushing data to disk in segments/shards, and replication is a step after that.
What I am curious about is: how is data consistency taken care of in production clusters when one node goes down and translog durability is configured as async?
No, there's no dependency between the steps. Replication generally happens first, but not always.
If you have replicas and your cluster health is green, then all acknowledged write operations have been successfully written to all replicas and therefore won't be lost if you lose a node.
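So, as a sketch, before taking a node out you could wait for green health first (the timeout value is just an example):

    GET /_cluster/health?wait_for_status=green&timeout=60s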