We are running Elasticsearch 5.6.3 on RHEL 7.4. The cluster has 10 nodes.
Hardware: 8 CPUs and 32 GB RAM.
Hard disk: HDD (Ceph storage)
Elasticsearch data paths: multiple (7 different mount points)
We are indexing log data into this cluster.
When we run an IO test with the fio tool, we get the following sequential write throughput across the multiple mount points:
WRITE: bw=20.3MiB/s (21.3MB/s), 20.3MiB/s-20.3MiB/s (21.3MB/s-21.3MB/s), io=24.0GiB (25.8GB), run=1210725-1210725msec
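Roughly, the fio job looked like the sketch below. The mount point paths, block size and per-job size here are placeholders rather than our exact values; the layout (one job per mount point, mmap ioengine, sequential writes) is what matters:

fio --name=seqwrite --ioengine=mmap --rw=write --bs=1M \
    --directory=/data1:/data2:/data3:/data4:/data5:/data6:/data7 \
    --numjobs=7 --size=3500m --group_reporting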
When indexing, however, we only see around 30-40 documents written to the indices per 10 seconds, which is very slow.
I would like to understand why the gap is so large when it comes to pushing real data.
We don't support Elasticsearch on distributed filesystems, so it's unlikely you will find anything that fixes your issue other than moving away from Ceph, sorry.
The issue is not the multiple data paths; the issue is that Ceph is distributed.
You are running a high-performance distributed system on top of a distributed filesystem, and that combination is not really designed for high performance.
Elasticsearch generally does not write very large segments sequentially, so your fio load is likely not very representative. I would recommend trying a random read/write load instead, as I would expect that to better match an Elasticsearch I/O pattern.
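As a starting point, a random read/write fio job could look something like this; the directory, sizes and read/write mix are only illustrative and should be adjusted to your setup:

fio --name=randrw --rw=randrw --rwmixread=50 --bs=4k \
    --directory=/path/to/es/data --size=4g --numjobs=4 \
    --runtime=300 --time_based --group_reporting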
Could you please elaborate on what "large segments" means? A single document with a huge mapping and a complex structure, or pushing multiple documents to different indices in parallel?
In our case, our mapping is not that large and consists mostly of text. We do push multiple documents to multiple indices in parallel from several of our applications.
Also, from the Elasticsearch documentation I understood that ES writes data sequentially, hence the sequential testing, using mmap as the fio ioengine.
We have also observed that Ceph storage backed by SSDs is about 50 times faster than Ceph backed by HDDs, even though both are distributed Ceph storage. So I cannot work out why the HDD-backed storage would have such slow IO during indexing when fio shows 3000-3500 IOPS and a bandwidth of about 20 MiB/s.
I find that disk benchmarks with large sequential reads and writes are not representative of a normal Elasticsearch load. If you believe differently, I would recommend setting up a cluster with fast local storage, running a test with a representative load, and profiling the I/O access patterns.
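For the profiling step, standard Linux tooling is enough; for example, something along these lines reports per-device request sizes, IOPS and queue behaviour every 5 seconds while the indexing load is running:

iostat -x 5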