ES flow

Hi,

Can you please explain the data and control flow of an index write in Elasticsearch?

For example, when we send documents, how do they flow from DRAM to the persistent storage device (SSD)?


It's a bit difficult to know how to answer such a broad question. Can you be more specific? What research have you already done into this question?


I need the data flow at this level of detail. For example:

Initially, new documents are collected in an in-memory indexing buffer.
Every so often, the buffer is committed:
A new segment (a supplementary inverted index) is written to disk.
A new commit point is written to disk, which includes the name of the new segment.
The disk is fsync’ed: all writes waiting in the filesystem cache are flushed to disk.
The new segment is opened, making the documents it contains visible to search.
The in-memory buffer is cleared, and is ready to accept new documents.
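
To make the steps above concrete, here is a toy Python model of that commit cycle. It is only a sketch under stated assumptions: `ToyShard`, the JSON file layout, and the `segment_N.json` / `commit_N.json` names are all hypothetical and are not how Elasticsearch or Lucene actually store data. It also leaves out the translog and refresh entirely.

```python
import json
import os
import tempfile


class ToyShard:
    """Toy model of the commit cycle listed above; not Elasticsearch or Lucene code."""

    def __init__(self, directory):
        self.directory = directory
        self.buffer = []          # in-memory indexing buffer
        self.open_segments = []   # segments currently visible to search
        self.generation = 0

    def index(self, doc):
        # New documents are only collected in memory at first.
        self.buffer.append(doc)

    def _write_and_fsync(self, name, payload):
        path = os.path.join(self.directory, name)
        with open(path, "w", encoding="utf-8") as f:
            json.dump(payload, f)
            f.flush()             # push Python's own buffer to the OS
            os.fsync(f.fileno())  # ask the OS to persist the file contents
        return path

    def commit(self):
        if not self.buffer:
            return
        self.generation += 1
        segment_name = f"segment_{self.generation}.json"
        # 1. Write the new segment (here: a tiny inverted index, term -> doc positions).
        inverted = {}
        for doc_id, doc in enumerate(self.buffer):
            for term in set(doc.lower().split()):
                inverted.setdefault(term, []).append(doc_id)
        self._write_and_fsync(segment_name, inverted)
        # 2. Write a commit point that names every segment, including the new one.
        names = [s["name"] for s in self.open_segments] + [segment_name]
        self._write_and_fsync(f"commit_{self.generation}.json", {"segments": names})
        # 3. Open the segment so the documents it contains become searchable.
        self.open_segments.append({"name": segment_name, "index": inverted})
        # 4. Clear the buffer so it can accept new documents.
        self.buffer = []

    def search(self, term):
        # Only opened segments are searched; buffered documents are invisible.
        return [s["name"] for s in self.open_segments if term.lower() in s["index"]]


with tempfile.TemporaryDirectory() as d:
    shard = ToyShard(d)
    shard.index("hello world")
    before = shard.search("hello")  # [] because the document is still only in the buffer
    shard.commit()
    after = shard.search("hello")   # ["segment_1.json"]
    files = sorted(os.listdir(d))   # ["commit_1.json", "segment_1.json"]
```

In this sketch a document only becomes searchable after `commit()` has written and fsync’ed both the segment and the commit point, which mirrors the order of the steps in my list.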

Likewise, I need the steps of the indexing process that show how Elasticsearch uses heap and off-heap memory.

What is the trigger point for a flush? And where does the flush happen from: the indexing buffer or the translog?

It would be great if you could provide the data flow from heap to disk.

What do you need this for? What problem are you trying to solve?

This chapter describes how a shard works and is probably still largely valid, even if it is a bit old.


Thanks, Christian.

I got what I was looking for.
