Reduced bulk performance in 2.3.5 versus 1.3?

Hi all,
Like many others, we're seeing a large difference in bulk indexing performance between our old 1.3 ES and our new 2.3.5 ES cluster. I've followed a bunch of threads, but many don't report back with what their solution was.

We have 10 data nodes and 3 master nodes.
Data nodes have 24GB RAM, 12GB of which is allotted to the JVM heap.
There's about 1TB of data; documents are around 10-50KB each, depending on type.

We use not_analyzed for many fields, but not all.
We use custom routing on our documents, but one bulk request can contain documents routed to many shards.
We use one replica. We have not tried setting replicas to 0 and upping it after the bulk process is done; that makes the cluster go yellow, which trips our monitoring alerts, so we'd like to avoid it if possible.
We do set translog.durability = async and are aware of the consequences.
We use SSDs and set the tuning parameters in line with the ES tuning blog.
We run marvel on a separate ES instance.
We have removed all search traffic from this cluster, and indexing does not improve.
This is one large index, and not time bucketed data.
I/O is super low (<10MB/sec)
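(For reference, the replica toggle we're avoiding would just be two per-index settings calls like the following; the index name and host are placeholders, not our real ones:)

```shell
# Hypothetical sketch: drop replicas before the bulk load, restore after.
# "myindex" and localhost:9200 are placeholders.
curl -XPUT "localhost:9200/myindex/_settings" -d '{
  "index" : { "number_of_replicas" : 0 }
}'

# ... run the bulk load ...

curl -XPUT "localhost:9200/myindex/_settings" -d '{
  "index" : { "number_of_replicas" : 1 }
}'
```

The downside, as mentioned, is the yellow cluster state while replicas rebuild.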

We have tried varying bulk sizes from 500 to 3000 documents.
We have tried varying our number of shards between 5 and 200.
We've tried varying the size of the bulk threadpool.
We've updated refresh_interval to -1 and 15m with no real change.
GCs are not thrashing, and memory looks good.
Other tweaks from the performance blog were tried (merge throttling = none, refresh_interval = -1, translog flush threshold = 1GB, etc.).
We do see that our bulk threadpools remain full and some queueing happens; it seems one bulk request spawns multiple threads internally. We only have 20 client threads submitting bulk requests, yet this fills the active bulk threads across all nodes (we have tried reducing the client count as well).
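For what it's worth, this is roughly how we watch the queueing, via the _cat thread pool API (host is a placeholder):

```shell
# Poll bulk threadpool state across all nodes: active threads, queued
# requests, and cumulative rejections per node.
curl "localhost:9200/_cat/thread_pool?v&h=host,bulk.active,bulk.queue,bulk.rejected"
```

Nonzero bulk.rejected is the clearest sign clients are outrunning the cluster.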

With 1.3, we saw ~12k documents indexed per second.
With 2.3, we're around 7k/sec.

At this point, I'm drawing a blank as to what to try next, other than throwing more hardware at it.

Any ideas what I might be forgetting? What other info would be helpful to diagnose?



Hi Chris, you may want to read through my ongoing thread; I have a very similar issue and an almost identical use case to yours.

I have no answers yet. I also came across this other thread:

Logstash Indexer to Elastcsearch Tunnings ( must go faster!)


The reason is the new translog durability default. ES now issues many more IOPS (especially fsync calls to the file system), and the operating system must handle this at the block device layer. Maybe you run ES on a virtual machine and the guest/host I/O channel is not configured well? Check file system mount options, network parameters, and I/O elevator settings for maximum throughput.
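For example, on Linux you can inspect (and switch) the elevator per block device; the device names below are just examples:

```shell
# Show the current I/O scheduler for each disk; the bracketed entry is the
# active one. On SSD-backed VMs, noop or deadline usually handles
# fsync-heavy workloads better than cfq.
for f in /sys/block/*/queue/scheduler; do
  [ -e "$f" ] || continue
  echo "$f: $(cat "$f")"
done

# Switch at runtime (not persistent across reboots), e.g. for sda:
# echo noop > /sys/block/sda/queue/scheduler
```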

What server hardware and operating system do you use?

Thanks, will check VM I/O settings

We are indeed using VMs, running CentOS 6.

Chris, I am also just figuring out the new Logstash 2.3 pipeline, and it seems the old method of allocating workers and queues is no longer great for tuning.

In my config, I thought 2 partitions in Kafka were enough for each of my Logstash agents, but Logstash never pulled or indexed faster than about 1000 msg/s, and never pulled data faster than what was being written to Kafka. When I added, say, 3 times more partitions, Logstash was suddenly pulling more data than I was writing. (The Kafka input auto-scales to cover the number of partitions in your topic.)

Also, the number of workers and flush size don't seem to work like they did. You also have to look at the -w and -b options for Logstash.

When I started to play with these options, indexing started to increase, though I have not figured out a formula yet.
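Concretely, the invocation I've been experimenting with looks something like this (the config file name and the worker/batch numbers are just my current guesses, not a recommendation):

```shell
# Logstash 2.3: -w sets the number of pipeline workers, -b sets the batch
# size each worker passes to the outputs, which bounds the ES bulk size.
bin/logstash -f indexer.conf -w 8 -b 1000
```

I've been varying -w and -b together and watching indexing rate, since batch size per worker is what actually reaches the elasticsearch output.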

Also, I set some of the options that I read about in some of the tuning guides; I would read up on each of the options before using them. I am not sure how important your data is to you, but I am a little more cowboy with mine (right now).

#DEBUG enable next line
#set -x
curl -XPUT "$HOSTNAME:$PORT/_cluster/settings" -d '{
        "transient" : {
                "indices.store.throttle.type" : "none",
                "index.translog.flush_threshold_size" : "2048mb",
                "index.translog.durability" : "async",
                "index.merge.scheduler.max_thread_count" : 20,
                "index.refresh_interval" : "60s"
        }
}'