Hi, I'm seeing a lot of put-mapping timeouts on my master server and have a few questions about them. The exact log output is:
Sep 20 16:00:14.533921 ops-flume-es-master-useast1-2-i-ba0a66ac ElasticSearch: [DEBUG][action.admin.indices.mapping.put] [ops-flume-es-master-useast1-2-i-ba0a66ac-flume-elasticsearch-production_vpc-useast1] failed to put mappings on indices [[flume-2016-09-20]], type [log]
Sep 20 16:00:14.534708 ops-flume-es-master-useast1-2-i-ba0a66ac ProcessClusterEventTimeoutException[failed to process cluster event (put-mapping [log]) within 30s]
Sep 20 16:00:14.535102 ops-flume-es-master-useast1-2-i-ba0a66ac at org.elasticsearch.cluster.service.InternalClusterService$2$1.run(InternalClusterService.java:349)
Sep 20 16:00:14.535381 ops-flume-es-master-useast1-2-i-ba0a66ac at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
Sep 20 16:00:14.535752 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
Sep 20 16:00:14.536012 ops-flume-es-master-useast1-2-i-ba0a66ac at java.lang.Thread.run(Thread.java:745)
I presume this may be related to load on the cluster, which has 77 indices, 1334 shards, and ~27B docs. We're running ES 2.4.0 on JDK 1.8.
A few questions:
- Is the timeout caused by the master itself timing out, or by nodes not responding to the master in time?
- Depending on the answer, what would help: adding nodes to spread the load, or scaling up the master(s)?
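In case it helps narrow this down, here is a minimal sketch of how the master's queue of pending cluster-state tasks could be inspected through the `_cluster/pending_tasks` API. My understanding (an assumption, not something I've confirmed) is that a long queue of `put-mapping` entries with a high time in queue would point at the master falling behind rather than at slow data nodes. The function names and the `http://localhost:9200` address are placeholders of mine, not anything from our actual setup.

```python
import json
from collections import Counter
from urllib.request import urlopen


def fetch_pending_tasks(base_url):
    """Fetch the elected master's pending cluster-state tasks as a dict."""
    url = base_url.rstrip("/") + "/_cluster/pending_tasks"
    with urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize_pending_tasks(response):
    """Count queued tasks by source type and report the longest wait.

    `response` is the parsed JSON body of GET /_cluster/pending_tasks,
    i.e. {"tasks": [{"source": ..., "time_in_queue_millis": ...}, ...]}.
    """
    tasks = response.get("tasks", [])
    # The first word of "source" is the task type, e.g. "put-mapping".
    by_source = Counter(task["source"].split(" ")[0] for task in tasks)
    oldest_ms = max((task["time_in_queue_millis"] for task in tasks), default=0)
    return {
        "total": len(tasks),
        "by_source": dict(by_source),
        "oldest_ms": oldest_ms,
    }


# Example usage (placeholder address, point it at any node in the cluster):
#   summary = summarize_pending_tasks(fetch_pending_tasks("http://localhost:9200"))
#   print(summary)
```

If `oldest_ms` sits near the 30s limit from the exception while `by_source` is dominated by `put-mapping`, that would at least be consistent with the tasks timing out while waiting on the master.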