We recently upgraded from ES version 5.6.10 to 7.3.1 and noticed that indexing performance seems much worse. We used to index at around 500k op/s, and now we peak at around 200k op/s.
Our use case is that we only reindex clusters that don't receive live traffic, so I don't think the limit on the thread pool size really makes sense for us. What was the reasoning behind it, and do other people have the same issue? I assume that if we could increase the thread pool size, we would be able to utilise ~100% of CPU for indexing.
Steps to reproduce:
Reindex a cluster with enough writes per second to get bulk thread rejections.
Are you indexing new documents only, or do you also update existing documents? What do disk I/O and iowait look like? How many indices and shards are you actively indexing into?
Besides, this limitation is still there in recent versions.
I see that it reads the node.processors setting to configure the thread pool limit. Is it possible to override the number of available processors in the Elasticsearch config somehow?
Ah, found it: in 7.3.1 the parameter is just processors.
Seems a bit hacky to have to fake the number of available processors to increase this limit, but hey ho.
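For anyone landing here later, this is roughly what the override looks like in elasticsearch.yml. Treat it as a sketch rather than a recommendation: the value 64 is made up for illustration, and I'm assuming a 7.3.x node where the setting is called processors. As far as I know, later releases rename it to node.processors, and newer major versions may refuse values above the real core count, so check the docs for your version.

```yaml
# elasticsearch.yml (7.3.x)
# Tell Elasticsearch how many processors to assume when sizing its
# thread pools, instead of using the detected core count.
# 64 is an illustrative value only; pick one that suits your hardware.
processors: 64
```

This is a static node setting, so it needs a node restart to take effect.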