Elasticsearch System Parameter for ASYNC IO

tusharnemade · September 21, 2021, 4:31pm

Hello Team

I am using Elasticsearch version 7.8.0.

I understand that Elasticsearch uses ASYNC IO to perform document indexing.

I would like to know the minimum values required in /etc/sysctl.conf for these ASYNC IO parameters:

aio-max-nr
aio-nr

These values are not defined in the documentation, hence I am asking here.

Could you please help me?

DavidTurner · September 21, 2021, 7:40pm

Where are you getting this information from? It doesn't sound correct to me.

tusharnemade · September 22, 2021, 2:59am

Hello @DavidTurner

Reference: "ES default - async or sync", comment 2, from an Elastic team member (Sep 2014):

Hiya

OK I see where the confusion is coming in. I used the word asynchronously
in slightly different contexts there. I will try to reword in the
Definitive Guide.

Replication is sync by default, in other words: the primary waits for
indexing to happen on the replica before it returns to the user. That
said, lots of these processes happen at the same time, so sending the
document to the replica is asynchronous. It doesn't send a change then
wait for the response before sending the next one. This all happens in
parallel.
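
The behaviour described in the quoted comment (the change is sent to the replicas in parallel, yet the caller is answered only after every replica has responded) can be pictured with a minimal Python sketch. This is not Elasticsearch code: `index_on_replica`, `replicate`, and the replica names are hypothetical and exist only for illustration.

```python
from __future__ import annotations

import time
from concurrent.futures import ThreadPoolExecutor


def index_on_replica(replica: str, doc: dict) -> str:
    """Hypothetical stand-in for indexing one document on one replica."""
    time.sleep(0.01)  # simulate network and indexing latency
    return f"{replica}: indexed {doc['id']}"


def replicate(doc: dict, replicas: list[str]) -> list[str]:
    """Send doc to all replicas in parallel, then wait for every response."""
    with ThreadPoolExecutor(max_workers=max(1, len(replicas))) as pool:
        # Dispatch is concurrent: no request waits for another one to finish.
        futures = [pool.submit(index_on_replica, r, doc) for r in replicas]
        # The caller blocks here until all replicas have answered, which is
        # the "sync by default" part of the quoted description.
        return [f.result() for f in futures]
```

Calling `replicate({"id": 1}, ["replica-0", "replica-1"])` returns one acknowledgement per replica, in the order the replicas were listed, and only once both have finished.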

warkolm · September 22, 2021, 3:44am

This doesn't relate to filesystem operations; it's talking about the logical operation that happens in replication.

tusharnemade · September 22, 2021, 4:16am

Thanks @warkolm for your response.

Will this logical operation cause the same IO operations at the disk level too?

Or is it SYNC IO from heap to disk?

In our VM / OS, ASYNC IO is enabled.

DavidTurner · September 22, 2021, 12:28pm

Yeah there's nothing about IO in those docs, which are also very old and may not be accurate any more. Elasticsearch doesn't use asynchronous IO.
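
For readers who want to see what the two counters from the original question currently report on a Linux host, they are exposed as files under `/proc/sys/fs`. The following is a small sketch, assuming a Linux machine. The helper names `read_aio_counter` and `aio_usage` are hypothetical, and the sketch only reads the values; it does not suggest any setting for Elasticsearch.

```python
from pathlib import Path
from typing import Optional


def read_aio_counter(name: str, base: str = "/proc/sys/fs") -> Optional[int]:
    """Return the integer stored in an AIO sysctl file, or None if unavailable."""
    try:
        return int((Path(base) / name).read_text().strip())
    except (OSError, ValueError):
        # File missing (for example on a non-Linux host) or not a number.
        return None


def aio_usage(base: str = "/proc/sys/fs") -> dict:
    """Collect the current AIO usage counter and its system-wide limit."""
    return {
        "aio-nr": read_aio_counter("aio-nr", base),
        "aio-max-nr": read_aio_counter("aio-max-nr", base),
    }
```

`aio_usage()` returns a dictionary with both values, or `None` for any file that cannot be read.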

tusharnemade · September 22, 2021, 12:34pm

Okay @DavidTurner

For my understanding, could you please correct me?

Hundreds of documents are pushed to Elasticsearch per second.

Are these documents processed into the index named "my-index" serially, one by one?

If indexing in the heap is done serially, is the translog updated only synchronously?
Is the write from heap memory to the actual file on the OS mount point done synchronously or asynchronously?

Or does it depend entirely on the OS IO operations from the Java heap to the index file on disk?

Could you please spare some time to respond, for my understanding?

It will help me a lot with configuration and with proceeding further in my setup.

DavidTurner · September 22, 2021, 5:50pm

It depends. Each bulk is processed one-by-one on each shard, but parallel bulks are processed in parallel. This has nothing to do with async vs sync IO.
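
One way to picture that distinction is the rough Python sketch below. It is not how Elasticsearch is implemented: `process_bulk` and `process_bulks` are hypothetical names, and a thread pool stands in for whatever actually runs the bulks.

```python
from concurrent.futures import ThreadPoolExecutor


def process_bulk(bulk_id, docs):
    """Hypothetical per-bulk handler: items are handled one at a time, in order."""
    processed = []
    for doc in docs:
        processed.append((bulk_id, doc))
    return processed


def process_bulks(bulks, workers=4):
    """Run separate bulks concurrently; each bulk stays sequential internally."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(process_bulk, bulk_id, docs)
            for bulk_id, docs in enumerate(bulks)
        ]
        # Results are collected in submission order, whatever order they ran in.
        return [f.result() for f in futures]
```

Nothing in this sketch involves asynchronous file IO: the parallelism comes from running several bulks at once, not from how bytes reach the disk.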

The translog is written using synchronous IO.

All files (including translogs) are written using synchronous IO.
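
As a generic illustration of synchronous (blocking) file IO, the Python sketch below appends a record to a file and returns only after each call has completed. The function name `append_sync` and the file it writes are hypothetical, and the explicit `os.fsync` is an assumption of this sketch rather than a statement about when Elasticsearch flushes its files.

```python
import os


def append_sync(path: str, record: bytes) -> int:
    """Append record using blocking calls; return the file size afterwards."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        view = memoryview(record)
        while len(view) > 0:
            # os.write blocks until the kernel has accepted some bytes.
            written = os.write(fd, view)
            view = view[written:]
        # Assumption of this sketch: force the data to the storage device
        # before returning. The call blocks until the flush has completed.
        os.fsync(fd)
    finally:
        os.close(fd)
    return os.path.getsize(path)
```

The calling thread does nothing else while `os.write` and `os.fsync` are in progress, which is the defining property of synchronous IO; kernel parameters such as aio-max-nr are not involved in this code path.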

system · October 20, 2021, 5:51pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
