Elasticsearch System Parameter for ASYNC IO

Hello Team

I am using Elasticsearch version 7.8.0.

I understand that Elasticsearch uses async IO to perform document indexing.

I would like to know the minimum values that need to be set in /etc/sysctl.conf for the following async IO parameters:

aio-max-nr
aio-nr

These are not defined in the documentation, hence I am asking here.

Could you please help me?

Where are you getting this information from? It doesn't sound correct to me.

Hello @DavidTurner

Reference: "ES default - async or sync", comment 2 (Elastic Team Member, Sep 2014):
Hiya

OK I see where the confusion is coming in. I used the word asynchronously
in slightly different contexts there. I will try to reword in the
Definitive Guide.

Replication is sync by default, in other words: the primary waits for
indexing to happen on the replica before it returns to the user. That
said, lots of these processes happen at the same time, so sending the
document to the replica is asynchronous. It doesn't send a change then
wait for the response before sending the next one. This all happens in
parallel.

This doesn't relate to filesystem operations; it's talking about the logical operation that happens in replication.
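To illustrate the distinction in the quoted comment, here is a minimal Python sketch, not Elasticsearch code. The function names `index_on_replica` and `index_with_sync_replication` are hypothetical, and the sleep only simulates latency. It shows a primary that sends a document to all replicas in parallel, but still waits for every acknowledgement before returning to the caller. No filesystem IO is involved at all.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def index_on_replica(replica: str, doc: dict) -> str:
    """Hypothetical stand-in for sending one document to one replica."""
    time.sleep(0.05)  # simulated network and indexing latency (assumption)
    return f"{replica}: indexed {doc['id']}"


def index_with_sync_replication(doc: dict, replicas: list[str]) -> list[str]:
    """Send to all replicas in parallel, then wait for every acknowledgement."""
    with ThreadPoolExecutor(max_workers=max(1, len(replicas))) as pool:
        # The sends are issued without waiting on each other ("asynchronous").
        futures = [pool.submit(index_on_replica, r, doc) for r in replicas]
        # The caller still blocks until all replicas have answered ("sync").
        return [f.result() for f in futures]


if __name__ == "__main__":
    acks = index_with_sync_replication({"id": 1}, ["replica-0", "replica-1"])
    print(acks)
```

The replicas work at the same time, yet the user only gets a response once all of them have finished.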

Thanks @warkolm for your response.

Will this logical operation cause the same kind of IO operations at the disk level too?

Or is the IO from heap to disk synchronous?

Async IO is enabled in our VM / OS.

Yeah there's nothing about IO in those docs, which are also very old and may not be accurate any more. Elasticsearch doesn't use asynchronous IO.

Okay @DavidTurner

For my understanding, could you please correct me?

  1. Hundreds of documents are pushed to Elasticsearch per second. Are these documents processed into the index named "my-index" serially, one by one?

  2. If indexing in the heap is done serially, is the translog also updated only synchronously?

  3. Is writing from heap memory to the actual files on the OS mount point done serially and synchronously, or asynchronously? Or does it depend entirely on how the OS performs IO from the Java heap to the index files on disk?

Could you please spare some time to respond?

It will help me a lot with configuration and with proceeding further in my setup.

It depends. Each bulk is processed one-by-one on each shard, but parallel bulks are processed in parallel. This has nothing to do with async vs sync IO.
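As a rough illustration of that answer, the following Python sketch models the behaviour with plain threads. It is not how Elasticsearch is implemented, and `process_bulk` and `run_parallel_bulks` are hypothetical names. Each bulk handles its documents strictly in order, while separate bulks run concurrently on a thread pool.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

log_lock = threading.Lock()


def process_bulk(bulk_id: str, docs: list[str], log: list[tuple[str, str]]) -> None:
    """Process the documents of one bulk one by one, in order."""
    for doc in docs:
        with log_lock:  # protect the shared log, not the ordering
            log.append((bulk_id, doc))


def run_parallel_bulks(bulks: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Run every bulk on its own worker; bulks may interleave with each other."""
    log: list[tuple[str, str]] = []
    with ThreadPoolExecutor(max_workers=max(1, len(bulks))) as pool:
        futures = [pool.submit(process_bulk, b, docs, log) for b, docs in bulks.items()]
        for f in futures:
            f.result()  # surface any worker exception
    return log


if __name__ == "__main__":
    print(run_parallel_bulks({"bulk-A": ["a1", "a2"], "bulk-B": ["b1", "b2"]}))
```

Entries from different bulks can interleave in the log, but the documents of any single bulk always appear in their original order.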

The translog is written using synchronous IO.

All files (including translogs) are written using synchronous IO.
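For readers unsure what "synchronous IO" means here, this is a small Python sketch of the general idea. Elasticsearch itself is written in Java, so this is only an analogy, and `append_durably` is a hypothetical helper. The calling thread blocks inside ordinary `write` and `fsync` calls until they return. Nothing is queued through the kernel's asynchronous IO interface, which is what the aio-* settings relate to.

```python
import os
import tempfile


def append_durably(path: str, record: bytes) -> int:
    """Append one record with blocking writes, then fsync before returning."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        written = 0
        while written < len(record):
            # os.write blocks until the kernel has accepted the bytes.
            written += os.write(fd, record[written:])
        # os.fsync blocks until the data has been flushed to the device.
        os.fsync(fd)
        return written
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "example.log")
        append_durably(log_path, b"operation-1\n")
        append_durably(log_path, b"operation-2\n")
        with open(log_path, "rb") as fh:
            print(fh.read())
```

Because every call blocks until it completes, writes like these do not depend on the async IO limits asked about at the start of the thread.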