Random spike in write performance

dna01 · June 22, 2023, 2:33am

I'm noticing random "slowness" when writing. For example, while most write operations complete in under 20 ms, an occasional write takes more than 1 s.
My setup:

6 data nodes, 24 GB JVM heap (plus 3 additional master nodes).
There are 4 indices: 2 with heavy writes only, 1 with heavy reads and writes, and 1 with heavy writes and moderate reads.
Index shards are spread evenly across all nodes. 2 indices have 12 primaries and 2 have 6 primaries; all have 1 replica.

This "slowness" occurs at random on all write-heavy indices. Each index receives roughly 5 write operations per second.

After enabling trace logging on "logger.org.elasticsearch.index", I found that while a node is refreshing a shard it appears to "suspend" all writes, even writes to a different index. For example, if the node is currently refreshing shards of index_A and a write request comes in for index_B, the write to index_B doesn't complete/return until that node finishes refreshing index_A.
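For anyone who wants to reproduce the observation, here is a minimal sketch of how that logger can be switched to TRACE at runtime through the cluster settings API. The base URL `http://localhost:9200` and the helper names are assumptions for illustration only; authentication and TLS are left out.

```python
import json
import urllib.request

# Logger name taken from the post above.
LOGGER_KEY = "logger.org.elasticsearch.index"


def logger_settings_body(level):
    """Build a cluster-settings body; level=None resets the logger to its default."""
    return {"persistent": {LOGGER_KEY: level}}


def build_request(base_url, level):
    """Prepare (but do not send) a PUT /_cluster/settings request."""
    body = json.dumps(logger_settings_body(level)).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_cluster/settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Hypothetical local cluster address; adjust for a real deployment.
    req = build_request("http://localhost:9200", "TRACE")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually apply the setting: urllib.request.urlopen(req)
```

Trace logging at this level is very verbose, so calling `build_request(..., None)` and sending the result afterwards resets the logger to its default.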
This behaviour is compounded by the replica shards, since a write request cannot complete until it has also been written to the replica. Thus, if any of the nodes processing the write (1 primary and n replicas) is currently refreshing, the write request experiences the "slowness".
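To make the compounding effect concrete, here is a rough back-of-the-envelope model. It assumes each node is refreshing independently for some fraction of the time, and that a write is acknowledged only after the primary and then every replica has finished. The probabilities and timings below are made-up illustrative values, not measurements from this cluster.

```python
def p_write_hits_refresh(p_refreshing, replicas):
    """Chance that at least one of the (1 primary + n replica) copies
    sits on a node that is refreshing, assuming independent nodes."""
    copies = 1 + replicas
    return 1.0 - (1.0 - p_refreshing) ** copies


def write_latency_ms(primary_ms, replica_ms):
    """Simplified model: the primary is written first, replicas are written
    in parallel afterwards, and the request returns after the slowest replica."""
    return primary_ms + max(replica_ms, default=0.0)


if __name__ == "__main__":
    # Hypothetical: a node spends 1% of its time refreshing.
    for replicas in (0, 1, 2):
        p = p_write_hits_refresh(0.01, replicas)
        print(f"replicas={replicas}: P(slow write) = {p:.4f}")

    # Hypothetical timings: a single stalled replica dominates the request.
    print(write_latency_ms(10.0, [8.0]))
    print(write_latency_ms(10.0, [1000.0]))
```

Under this model, every extra replica raises the chance that a given write lands on at least one refreshing node, and a single stalled copy is enough to push the whole request past 1 s.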

Is this the correct behaviour?

Any suggestions on how to reduce the duration or the frequency of this "slowness" would be greatly appreciated.

