A simple cluster: 2 nodes, 1 replica. Each node has 1.5 GB of RAM, 2 cores, and SAS
disks.
With Elasticsearch 1.1 we saw occasional disconnections and some CPU load spikes.
Logstash was being used badly (lots of tiny bulk imports, with monthly indices).
The Logstash usage was fixed, and Elasticsearch was upgraded to 1.3: 1.3.3 first,
then 1.3.4 ten minutes later.
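
To illustrate what I mean by "tiny bulk imports", here is a rough Python sketch of
the opposite, one big batched _bulk call (host, index name and type are placeholders,
not our actual setup; the real change was in the Logstash configuration):

import json
import urllib.request

ES_URL = "http://localhost:9200"  # placeholder host/port

def bulk_index(index, doc_type, docs):
    # Build one newline-delimited _bulk body instead of one request per document.
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    body = "\n".join(lines) + "\n"  # _bulk requires a trailing newline
    req = urllib.request.Request(ES_URL + "/_bulk", data=body.encode("utf-8"))
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

# e.g. a few thousand events per call instead of one event per call:
# bulk_index("logstash-2014.09", "logs", events)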
CPU usage is now at 100% (so one full core), LOTS of file descriptors are open,
and memory usage keeps growing. RAM was upgraded to 2 GB.
strace shows that 5 threads use a lot of CPU and one thread makes 7000 stat() calls
per second. The Elasticsearch hot threads API shows lots of FSDirectory.listAll.
Disk usage is low; it is just a lot of stat() calls.
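
For reference, this is roughly how I pull the hot threads report (host is a
placeholder):

import urllib.request

ES_URL = "http://localhost:9200"  # placeholder host/port

# Hot threads report for every node; FSDirectory.listAll shows up near the
# top of the busiest stacks.
with urllib.request.urlopen(ES_URL + "/_nodes/hot_threads?threads=5") as resp:
    print(resp.read().decode("utf-8"))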
Each index is set to 9 shards, and Logstash creates lots of indices: 2286 shards
for 7 GB of data, and 37487 files in the indices folder.
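
This is how I count the shards, and a hypothetical template that would give new
logstash-* indices 1 shard instead of 9 (the template name is made up, and existing
indices would keep their 9 shards):

import json
import urllib.request

ES_URL = "http://localhost:9200"  # placeholder host/port

# Count the shards across all indices (this is where the 2286 comes from).
with urllib.request.urlopen(ES_URL + "/_cat/shards?h=index") as resp:
    print("total shards:", len(resp.read().decode("utf-8").splitlines()))

# Hypothetical template: future logstash-* indices get 1 shard instead of 9.
template = {
    "template": "logstash-*",
    "settings": {"index.number_of_shards": 1, "index.number_of_replicas": 1},
}
req = urllib.request.Request(
    ES_URL + "/_template/logstash_one_shard",
    data=json.dumps(template).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="PUT",
)
with urllib.request.urlopen(req) as resp:
    print(resp.read().decode("utf-8"))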
In the recovery API everything is "done", but with strange percentage values, and
all shards are in a "replica" state.
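
This is roughly how I look at the recovery state, one line per shard with its stage
and percentages (host is a placeholder):

import urllib.request

ES_URL = "http://localhost:9200"  # placeholder host/port

with urllib.request.urlopen(ES_URL + "/_cat/recovery?v") as resp:
    print(resp.read().decode("utf-8"))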
Now the load comes in heavy waves, slowing down the service.
Is this just a long migration between Lucene versions (from ES 1.1 to 1.3), a
misconfiguration, a real bug, or am I just doomed?