Elasticsearch keeps timing out in Docker


(Oli Lalonde) #1

For some reason, my Elasticsearch server starts timing out as soon as there are 2 or more concurrent connections (and it's not doing more than ~3 queries per second).

Here's the error I keep getting:

llib3.py", line 122, in perform_request
    raise ConnectionTimeout('TIMEOUT', str(e), e)
elasticsearch.exceptions.ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=10))
"""

I'm using docker.elastic.co/elasticsearch/elasticsearch:5.3.0 with the following config file:

cluster.name: "docker-cluster"
network.host: 0.0.0.0

# minimum_master_nodes need to be explicitly set when bound on a public IP
# set to 1 to allow single node clusters
# Details: https://github.com/elastic/elasticsearch/pull/17288
discovery.zen.minimum_master_nodes: 1

# enable CORS
http.cors.enabled: true
http.cors.allow-origin: "*"

# increase query max length
indices.query.bool.max_clause_count: 100000

# disable auth
xpack.security.enabled: false

If there is only 1 concurrent connection to ES, I'm not getting any error.
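
For reference, a minimal script along these lines is enough to reproduce it here (the index name and query are placeholders for my real ones):

import threading
from elasticsearch import Elasticsearch

def worker():
    # each thread gets its own client; 'my-index' is a placeholder
    es = Elasticsearch(['localhost:9200'])
    for _ in range(10):
        es.search(index='my-index', body={'query': {'match_all': {}}})

# one thread runs fine; two or more start raising ConnectionTimeout
threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()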


(Oli Lalonde) #2

Some logs:

elasticsearch_1  | [2017-04-28T11:16:52,718][WARN ][o.e.m.j.JvmGcMonitorService] [yfHJjxF] [gc][15976] overhead, spent [1.3m] collecting in the last [1.3m]
elasticsearch_1  | [2017-04-28T11:18:25,981][WARN ][o.e.m.j.JvmGcMonitorService] [yfHJjxF] [gc][old][15982][18] duration [1.4m], collections [1]/[1.4m], total [1.4m]/[18.5m], memory [1.9gb]->[1.8gb]/[1.9gb], all_pools {[young] [266.2mb]->[172.7mb]/[266.2mb]}{[survivor] [26.2mb]->[0b]/[33.2mb]}{[old] [1.6gb]->[1.6gb]/[1.6gb]}
elasticsearch_1  | [2017-04-28T11:18:26,126][WARN ][o.e.m.j.JvmGcMonitorService] [yfHJjxF] [gc][15982] overhead, spent [1.4m] collecting in the last [1.4m]
elasticsearch_1  | [2017-04-28T11:18:28,268][WARN ][o.e.m.j.JvmGcMonitorService] [yfHJjxF] [gc][15984] overhead, spent [829ms] collecting in the last [1.1s]

Does spent [1.4m] collecting in the last [1.4m] mean it wasn't doing anything useful for the last 1.4 minutes?


(Aaron Mildenstein) #3

Yes, that means that the JVM spent nearly 100% of its time performing garbage collection.

If you have garbage collection ([gc]) happening that frequently, and for that long (1.4 minutes!), then I'd say you have a memory pressure issue: the default 2G heap is not up to what you're asking of it. You should increase it to as much as 50% of the available system memory, but no higher than ~30G (so the JVM can still use compressed object pointers). For example, if the system has 32G of RAM, you could set the heap to 16G.
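
With the official Docker image you set the heap through the ES_JAVA_OPTS environment variable, for example in docker-compose (the 4g here is only an example; size it to roughly half of your host's RAM):

version: '2'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:5.3.0
    environment:
      # set -Xms and -Xmx to the same value
      - "ES_JAVA_OPTS=-Xms4g -Xmx4g"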


(system) #4

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.