Yesterday I upgraded my Elasticsearch cluster from 1.6 to 2.4, and since then indexing performance has been awful.
My pipeline is {bro nms logs / syslogs / windows logs} -> redis -> logstash -> elasticsearch.
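For reference, the Logstash input side is just a plain Redis list consumer, roughly like this (host and key names are placeholders, not my exact config):

```
input {
  redis {
    host      => "redis.example.local"   # placeholder broker host
    data_type => "list"                  # events are pushed onto a Redis list by the shippers
    key       => "logstash"              # placeholder list key
  }
}
```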
The only thing I upgraded was Elasticsearch; Logstash has been on the 2.x branch for months, and everything was fine until now.
With ES 1.6 I could sustain an indexing rate of about 13k events per second. The cluster is 8 data nodes on dedicated hardware, each with a dedicated 1 TB data disk, 32 GB of RAM (16 GB given to ES), and 4 CPU cores, plus 3 master nodes running in VMs. Under 1.6, indexing was CPU bound, with the cores maxed out.
On ES 2.4 I can only get about 800 events per second, and the cluster now looks disk bound: the I/O lights are lit up solid while the CPUs sit at about 20%.
I've tried playing with the Logstash output settings, adjusting flush_size and the number of worker threads, which got me from about 300 events per second up to 800, but beyond that I can't squeeze out any more throughput. If I can't fix this, I'm going to have to blow away the cluster, reinstall 1.6, and restore from backup.
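For reference, the knobs I've been turning are in the elasticsearch output block, roughly like this (hosts and the exact values are illustrative, not my real config):

```
output {
  elasticsearch {
    hosts      => ["es-data-01:9200", "es-data-02:9200"]  # placeholder data node addresses
    index      => "logstash-%{+YYYY.MM.dd}"
    flush_size => 5000   # bulk request size; one of the values I've experimented with
    workers    => 4      # output worker threads; raising this got me from ~300 to ~800 eps
  }
}
```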
Anybody got any ideas?