For some reason, my Elasticsearch server starts timing out as soon as there are two or more concurrent connections (even though it's handling no more than ~3 queries per second).
Here's the error I keep getting:
llib3.py", line 122, in perform_request
raise ConnectionTimeout('TIMEOUT', str(e), e)
elasticsearch.exceptions.ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=10))
"""
I'm using docker.elastic.co/elasticsearch/elasticsearch:5.3.0 with the following config file:
cluster.name: "docker-cluster"
network.host: 0.0.0.0
# minimum_master_nodes need to be explicitly set when bound on a public IP
# set to 1 to allow single node clusters
# Details: https://github.com/elastic/elasticsearch/pull/17288
discovery.zen.minimum_master_nodes: 1
# enable CORS
http.cors.enabled: true
http.cors.allow-origin: "*"
# increase query max length
indices.query.bool.max_clause_count: 100000
# disable auth
xpack.security.enabled: false
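(For reference, a minimal way to run this image with that config mounted looks roughly like the following; the host path is an assumption:)

docker run -p 9200:9200 \
  -v "$PWD/elasticsearch.yml":/usr/share/elasticsearch/config/elasticsearch.yml \
  docker.elastic.co/elasticsearch/elasticsearch:5.3.0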
If there is only 1 concurrent connection to ES, I'm not getting any error.
Yes, that means that the JVM spent nearly 100% of its time performing garbage collection.
If you have garbage collection ([gc]) happening that frequently, for that long (1.4 minutes!), then I'd say you have a memory pressure issue. The default 2G heap is not up to the task you're asking of it. You should increase it to as much as 50% of the available system memory, but no higher than ~30g, so the JVM can keep using compressed object pointers. E.g. if the system has 32G of RAM, then you could set the heap to 16G.
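With the Docker image above, one way to set the heap is through ES_JAVA_OPTS; a sketch assuming a 32G host (hence the 16g, per the sizing rule above; keep -Xms and -Xmx equal):

docker run -p 9200:9200 \
  -e ES_JAVA_OPTS="-Xms16g -Xmx16g" \
  docker.elastic.co/elasticsearch/elasticsearch:5.3.0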