Cluster timeouts during various requests

Elasticsearch version (bin/elasticsearch --version):

5.6.3

Plugins installed:
x-pack
mapper-size

JVM version (java -version):
Oracle JRE 1.8.0_102

OS version (uname -a if on a Unix-like system):
Debian Wheezy
Linux esdb2 4.4.82-ls-2 #2 SMP Tue Aug 22 16:46:15 CEST 2017 x86_64 GNU/Linux

Description of the problem including expected versus actual behavior:
Many requests time out. For example:

  • mapping
  • allocation
  • snapshot create/restore

Provide logs (if relevant):

[2017-11-07T13:45:40,315][DEBUG][o.e.a.a.i.m.p.TransportPutMappingAction] [esdb2] failed to put mappings on indices [[[lslogbeat-2017.11.07/ybyWgKnaRd23NnBxbj7IOg]]], type [1510057956]
org.elasticsearch.cluster.metadata.ProcessClusterEventTimeoutException: failed to process cluster event (put-mapping) within 30s
    at org.elasticsearch.cluster.service.ClusterService$ClusterServiceTaskBatcher.lambda$null$0(ClusterService.java:255) ~[elasticsearch-5.6.3.jar:5.6.3]
    at java.util.ArrayList.forEach(ArrayList.java:1249) ~[?:1.8.0_102]
    at org.elasticsearch.cluster.service.ClusterService$ClusterServiceTaskBatcher.lambda$onTimeout$1(ClusterService.java:254) ~[elasticsearch-5.6.3.jar:5.6.3]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569) [elasticsearch-5.6.3.jar:5.6.3]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_102]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_102]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_102]
[the same put-mapping ProcessClusterEventTimeoutException is logged three more times at the same timestamp]
[2017-11-07T13:45:40,600][WARN ][o.e.c.s.ClusterService   ] [esdb2] cluster state update task [put-mapping[1510057717, 1510057717, 1510057717, 1510057717, 1510057717, 1510057717, 1510057717, 1510057717, 1510057954, 1510057954, 1510057954, 1510057971, 1510058317, 1510058318, 1510057972, 1510057953, 1510057973, 1510057976, 1510057954, 1510057954, 1510058191, 1510058191, 1510058098, 1510058099, 1510058192, 1510058193, 1510058196, 1510058312, 1510058313, 1510058087, 1510058197, 1510058318, 1510058088, 1510058319, 1510057728]] took [2m] above the warn threshold of 30s
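The warn line above shows 35 put-mapping tasks batched into a single cluster state update that took 2 minutes. Cluster state updates run on a single-threaded queue on the master, and each queued task carries its own 30 s deadline while it waits, so one slow batch is enough to expire everything queued behind it. A toy model of that queueing effect (not Elasticsearch code, just an illustration of the arithmetic):

```python
# Toy model: queued cluster-state tasks expire individually if they
# wait longer than the 30 s put-mapping timeout.
TIMEOUT = 30.0

def count_expired(wait_times):
    """wait_times: seconds each queued task spent waiting for the
    single-threaded master queue to reach it. A task expires if it
    waited longer than TIMEOUT."""
    return sum(1 for wait in wait_times if wait > TIMEOUT)

# 35 put-mapping tasks all stuck behind one 2-minute update:
waits = [120.0] * 35
print(count_expired(waits))  # all 35 expire
```

This is why the DEBUG timeouts and the 2 m warn line describe the same incident: the tasks did not fail, they simply could not be reached within their deadline.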

What is the full output of the cluster stats API?

Thank you for the reply. Stats output:

{
    "_nodes": {
        "total": 6,
        "successful": 6,
        "failed": 0
    },
    "cluster_name": "esdb",
    "timestamp": 1510068170859,
    "status": "green",
    "indices": {
        "count": 484,
        "shards": {
            "total": 1934,
            "primaries": 1884,
            "replication": 0.02653927813163482,
            "index": {
                "shards": {
                    "min": 1,
                    "max": 5,
                    "avg": 3.9958677685950414
                },
                "primaries": {
                    "min": 1,
                    "max": 5,
                    "avg": 3.8925619834710745
                },
                "replication": {
                    "min": 0,
                    "max": 3,
                    "avg": 0.10330578512396695
                }
            }
        },
        "docs": {
            "count": 15340225039,
            "deleted": 92405
        },
        "store": {
            "size": "8.3tb",
            "size_in_bytes": 9233638509426,
            "throttle_time": "0s",
            "throttle_time_in_millis": 0
        },
        "fielddata": {
            "memory_size": "5.7mb",
            "memory_size_in_bytes": 6000848,
            "evictions": 0
        },
        "query_cache": {
            "memory_size": "3.9gb",
            "memory_size_in_bytes": 4278054627,
            "total_count": 24743883,
            "hit_count": 7023806,
            "miss_count": 17720077,
            "cache_size": 65992,
            "cache_count": 260237,
            "evictions": 194245
        },
        "completion": {
            "size": "0b",
            "size_in_bytes": 0
        },
        "segments": {
            "count": 44250,
            "memory": "15.7gb",
            "memory_in_bytes": 16954793886,
            "terms_memory": "13gb",
            "terms_memory_in_bytes": 14061932355,
            "stored_fields_memory": "1.9gb",
            "stored_fields_memory_in_bytes": 2063181288,
            "term_vectors_memory": "0b",
            "term_vectors_memory_in_bytes": 0,
            "norms_memory": "116.1kb",
            "norms_memory_in_bytes": 118976,
            "points_memory": "245.3mb",
            "points_memory_in_bytes": 257298999,
            "doc_values_memory": "545.7mb",
            "doc_values_memory_in_bytes": 572262268,
            "index_writer_memory": "17.6mb",
            "index_writer_memory_in_bytes": 18497368,
            "version_map_memory": "798.1kb",
            "version_map_memory_in_bytes": 817334,
            "fixed_bit_set": "1kb",
            "fixed_bit_set_memory_in_bytes": 1080,
            "max_unsafe_auto_id_timestamp": 1510012814788,
            "file_sizes": {}
        }
    },
    "nodes": {
        "count": {
            "total": 6,
            "data": 6,
            "coordinating_only": 0,
            "master": 6,
            "ingest": 6
        },
        "versions": [
            "5.6.3"
        ],
        "os": {
            "available_processors": 208,
            "allocated_processors": 240,
            "names": [
                {
                    "name": "Linux",
                    "count": 6
                }
            ],
            "mem": {
                "total": "375.8gb",
                "total_in_bytes": 403552321536,
                "free": "7.1gb",
                "free_in_bytes": 7722110976,
                "used": "368.6gb",
                "used_in_bytes": 395830210560,
                "free_percent": 2,
                "used_percent": 98
            }
        },
        "process": {
            "cpu": {
                "percent": 8
            },
            "open_file_descriptors": {
                "min": 1155,
                "max": 2558,
                "avg": 1652
            }
        },
        "jvm": {
            "max_uptime": "8.2d",
            "max_uptime_in_millis": 711771755,
            "versions": [
                {
                    "version": "1.8.0_102",
                    "vm_name": "Java HotSpot(TM) 64-Bit Server VM",
                    "vm_version": "25.102-b14",
                    "vm_vendor": "Oracle Corporation",
                    "count": 6
                }
            ],
            "mem": {
                "heap_used": "96.9gb",
                "heap_used_in_bytes": 104082771384,
                "heap_max": "178.7gb",
                "heap_max_in_bytes": 191982469120
            },
            "threads": 2575
        },
        "fs": {
            "total": "27.2tb",
            "total_in_bytes": 29922098896896,
            "free": "17.5tb",
            "free_in_bytes": 19297268899840,
            "available": "17.5tb",
            "available_in_bytes": 19297268899840,
            "spins": "true"
        },
        "plugins": [
            {
                "name": "x-pack",
                "version": "5.6.3",
                "description": "Elasticsearch Expanded Pack Plugin",
                "classname": "org.elasticsearch.xpack.XPackPlugin",
                "has_native_controller": true
            },
            {
                "name": "mapper-size",
                "version": "5.6.3",
                "description": "The Mapper Size plugin allows document to record their uncompressed size at index time.",
                "classname": "org.elasticsearch.plugin.mapper.MapperSizePlugin",
                "has_native_controller": false
            }
        ],
        "network_types": {
            "transport_types": {
                "netty4": 6
            },
            "http_types": {
                "netty4": 6
            }
        }
    }
}
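A few ratios from the stats above are worth computing, since a large cluster state (many indices and shards, big mappings) is a common cause of slow put-mapping updates. A quick back-of-the-envelope check, with the figures copied straight from the response:

```python
# Figures copied from the cluster stats response above.
shards = 1934
nodes = 6
store_bytes = 9233638509426          # indices.store.size_in_bytes
segments_memory_bytes = 16954793886  # indices.segments.memory_in_bytes
heap_max_bytes = 191982469120        # nodes.jvm.mem.heap_max_in_bytes

print(f"shards per node: {shards / nodes:.0f}")
print(f"avg shard size:  {store_bytes / shards / 2**30:.1f} GiB")
print(f"segment memory:  {segments_memory_bytes / heap_max_bytes:.0%} of total heap")
```

Roughly 320 shards per node on a six-node, all-master cluster means every cluster state update touches a lot of metadata, which lines up with put-mapping batches taking minutes.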

Are you using dynamic mappings? What is the size of your mappings?

What do you mean by the "size" of mappings?

And yes, in most cases. We are using ES to store logs from syslog and Logstash,
so there are many dynamic mappings.
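By the "size" of mappings, the question presumably refers to how large the mapping metadata itself has grown. With dynamic mapping, every field name never seen before is added to the index mapping permanently, so varied log payloads (such as a misbehaving agent embedding data in field names) make the mapping, and with it the cluster state that each put-mapping update must publish, grow without bound. A sketch of that effect, with invented field names purely for illustration:

```python
# Sketch: a dynamic mapping grows by one entry per never-seen field name.
def grow_mapping(docs):
    """Return the set of field names a dynamic mapping would accumulate
    after indexing the given documents."""
    mapping = set()
    for doc in docs:
        mapping.update(doc.keys())  # each new key becomes a mapped field
    return mapping

well_behaved = [{"host": "a", "msg": "up"}, {"host": "b", "msg": "down"}]
# A buggy agent that embeds variable data in the field names themselves:
buggy = [{f"metric_{i}": 1} for i in range(1000)]

print(len(grow_mapping(well_behaved)))  # 2 fields: mapping stays small
print(len(grow_mapping(buggy)))         # 1000 fields: mapping explodes
```

This is consistent with the eventual diagnosis below: a buggy Beats daemon flooding the cluster with garbage would trigger exactly this kind of runaway mapping growth and the resulting put-mapping timeouts.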

All the timeouts in the cluster were caused by a buggy Beats daemon, probably some bug that overloaded the cluster with garbage. I stopped all the Beats and everything is OK now. The syslog Logstash instances are indexing data with no problem.

So I'm closing this.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.