Poor Update Performance Despite Refresh Interval Compromise

Nariman_Haghighi · April 23, 2014, 2:01pm

Running a 2-node cluster, we're experiencing less than ideal update times
even after adjusting the refresh interval.

Settings are:
"number_of_replicas":"1","number_of_shards":"5","refresh_interval":"5s"

The two VMs each have 4 cores and 7 GB of RAM, and the following are the response times
reported (averaged over a 2-3 month period):

Imported 2475 documents in 7107 milliseconds
Imported 2475 documents in 4862 milliseconds
Imported 2475 documents in 6015 milliseconds
Imported 2475 documents in 5991 milliseconds
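
For comparison with throughput figures quoted in documents per second, the timings above can be converted with a few lines of arithmetic. This is only an illustrative sketch: the numbers are copied from the four lines above, and the helper name docs_per_second is made up for the example.

```python
# Timings copied from the post above: (documents, milliseconds).
runs = [(2475, 7107), (2475, 4862), (2475, 6015), (2475, 5991)]


def docs_per_second(docs, millis):
    """Convert a document count and a duration in milliseconds to docs/s."""
    return docs / (millis / 1000.0)


# Per-run throughput, rounded to whole documents per second.
rates = [round(docs_per_second(d, ms)) for d, ms in runs]

# Throughput over all four runs combined (total docs / total time).
overall = round(docs_per_second(sum(d for d, _ in runs),
                                sum(ms for _, ms in runs)))

print(rates)    # [348, 509, 411, 413]
print(overall)  # 413
```

So the reported runs work out to roughly 350 to 500 documents per second for the whole cluster.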

My understanding of the reported times (using the Took value on Elasticsearch.NET's
bulk response) is that they don't include the network delays associated
with getting the request to ES, just the server processing time:
https://github.com/elasticsearch/elasticsearch-net/issues/453

In your experience, are these times below average, average, or above average? Can
anything else be done to improve indexing performance here?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/44863f41-1eac-4294-a710-c00d7ea8e1bc%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

nik9000 · April 23, 2014, 2:13pm

It really depends on the size of the documents and how CPU-heavy the
analysis is. I feel pretty good when I can get 5,000 documents per second across 16
servers with 96GB of RAM and 12 (couple-year-old) CPUs. But my documents
are generally a couple hundred KB and range up into tens of MB. OTOH, I
can't overwhelm the servers because they are still serving searches
during this time, so I try to keep the bump in CPU load due to this kind of
bulk indexing around 25%.

The standard advice is to disable refresh during bulk loads (set
refresh_interval to -1) if you can get away with it, and to make sure you are
sending the bulk requests from many threads.
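
A rough sketch of that advice in Python, under stated assumptions: the transport is a hypothetical send(method, path, body) callable that you would supply (for example, a thin wrapper around an HTTP client), and the index name, type name, batch size, and worker count are placeholders rather than recommendations. The `_type` field reflects the 2014-era bulk format; later Elasticsearch versions removed mapping types.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def refresh_settings(interval):
    """Body for PUT /<index>/_settings; "-1" disables periodic refresh."""
    return json.dumps({"index": {"refresh_interval": interval}})


def chunks(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def bulk_body(index, doc_type, docs):
    """Newline-delimited _bulk payload: one action line, then one source line."""
    lines = []
    for doc_id, source in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"


def bulk_load(docs, send, index="myindex", doc_type="doc",
              batch_size=500, workers=4, restore_interval="5s"):
    """Disable refresh, send batches from several threads, restore refresh.

    `send` is a hypothetical transport callable, not a real library API.
    """
    send("PUT", "/%s/_settings" % index, refresh_settings("-1"))
    try:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = [
                pool.submit(send, "POST", "/_bulk",
                            bulk_body(index, doc_type, batch))
                for batch in chunks(docs, batch_size)
            ]
            for future in futures:
                future.result()  # re-raise any error from a worker thread
    finally:
        # Restore the previous interval even if a batch failed.
        send("PUT", "/%s/_settings" % index, refresh_settings(restore_interval))
```

The try/finally is there so that a failed batch does not leave the index with refresh disabled. Documents are passed as (id, source) pairs; the right batch size and thread count depend on document size and on how much search load the cluster has to keep serving.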

Nik

