Bad performance and crashes with Elasticsearch 5.1


(Vitaly) #1

We're running ELK 5.1.1; Elasticsearch runs on a standalone EC2 instance with 4 CPUs and 16GB RAM.
We're indexing about 20GB daily across 40 indexes per day, so with 30-40 days of retention we have ~800GB of data in about 1800 indexes.
This is a staging environment; in production we're running Elasticsearch 2.x, and a server with the same specs copes fine with >300GB/day, i.e. about 15 times more.
As far as I can see, our traffic is very low for this server; our baseline is about 3% iowait and about 15% user CPU load.
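
As a rough cross-check of the figures above, here is a minimal sizing sketch. It assumes the indexes use the Elasticsearch 5.x default of 5 primary shards per index; the post does not state the configured shard count, and the function and variable names are illustrative only.

```python
# Back-of-envelope sizing from the figures quoted in the post.
# ASSUMPTION: each index has 5 primary shards (the Elasticsearch 5.x default);
# the actual shard count of this cluster is not given in the thread.

def sizing(total_gb, index_count, shards_per_index=5):
    """Return (avg index size in MB, total primary shards, avg shard size in MB)."""
    avg_index_mb = total_gb * 1024 / index_count
    total_shards = index_count * shards_per_index
    avg_shard_mb = total_gb * 1024 / total_shards
    return avg_index_mb, total_shards, avg_shard_mb

avg_index_mb, total_shards, avg_shard_mb = sizing(total_gb=800, index_count=1800)
print(f"average index size: {avg_index_mb:.0f} MB")
print(f"primary shards (if defaults apply): {total_shards}")
print(f"average shard size: {avg_shard_mb:.0f} MB")
```

If the defaults do apply, that is thousands of small shards on a single node, which may be worth checking alongside the other symptoms.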
We have two issues:

  • all searches are slow. For instance, a basic discovery over the last week takes about 40 seconds. During that time user CPU usage is close to 100%, while iowait stays low, up to 5%. Many queries are aborted by circuit breakers, and when that happens Elasticsearch stops indexing.
  • from time to time it stops indexing
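
To see which circuit breakers are aborting queries, the node stats API (`GET /_nodes/stats/breaker`) reports a `tripped` counter per breaker. The sketch below only inspects an already-parsed response; the helper name and the sample data are made up for illustration, and fetching the response is left out.

```python
# Sketch: list circuit breakers that have tripped, given the parsed JSON body
# of GET /_nodes/stats/breaker. The helper name is hypothetical.

def tripped_breakers(stats):
    """Return (node name, breaker name, tripped count) for breakers with tripped > 0."""
    found = []
    for node_id, node in stats.get("nodes", {}).items():
        name = node.get("name", node_id)
        for breaker, info in node.get("breakers", {}).items():
            if info.get("tripped", 0) > 0:
                found.append((name, breaker, info["tripped"]))
    return found

# Made-up sample in the general shape of the API response (not real data).
sample = {
    "nodes": {
        "abc123": {
            "name": "node-1",
            "breakers": {
                "request": {"limit_size_in_bytes": 5153960755, "tripped": 12},
                "fielddata": {"limit_size_in_bytes": 5153960755, "tripped": 0},
                "parent": {"limit_size_in_bytes": 6012954214, "tripped": 0},
            },
        }
    }
}

for node, breaker, count in tripped_breakers(sample):
    print(f"{node}: breaker '{breaker}' tripped {count} times")
```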

"jps -l -m -v" output:
16460 sun.tools.jps.Jps -l -m -v -Dapplication.home=/usr/lib/jvm/jdk1.8.0_101 -Xms8m
1916 org.elasticsearch.bootstrap.Elasticsearch -d -p /var/run/elasticsearch/elasticsearch.pid -Edefault.path.logs=/var/log/elasticsearch -Edefault.path.data=/elasticsearch/data -Edefault.path.conf=/etc/elasticsearch -Xms8g -Xmx8g -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+DisableExplicitGC -XX:+AlwaysPreTouch -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -Dlog4j.skipJansi=true -XX:+HeapDumpOnOutOfMemoryError -Djna.tmpdir=/elasticsearch/tmp -Des.path.home=/usr/share/elasticsearch

Any ideas?

TIA, Vitaly


(Jason Tedor) #2

All versions less than 5.2.2 are impacted by three issues:

From your description, I think you are most likely heavily impacted by the first issue.

I think you should upgrade.
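
A minimal sketch of checking whether a node falls in the affected range (anything below 5.2.2). The version string could be read from the `version.number` field returned by `GET /`; the function names here are illustrative, and only plain numeric versions are handled.

```python
# Sketch: decide whether an Elasticsearch version is older than 5.2.2.
# ASSUMPTION: the version is purely numeric ("5.1.1", "5.3"); suffixes such as
# "-alpha1" are not handled.

def version_tuple(text):
    """Parse '5.3' or '5.1.1' into a 3-tuple, padding missing parts with 0."""
    parts = tuple(int(part) for part in text.split("."))
    return parts + (0,) * (3 - len(parts))

def is_affected(version, fixed_in="5.2.2"):
    """True if `version` is strictly older than `fixed_in`."""
    return version_tuple(version) < version_tuple(fixed_in)

for v in ("5.1.1", "5.2.2", "5.3"):
    print(v, "affected" if is_affected(v) else "not affected")
```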


(Vitaly) #3

Jason, many thanks!
We upgraded to 5.3 last Thursday, and so far the system seems much better: no crashes in a week.
Vitaly


(Jason Tedor) #4

You're welcome!

