Elasticsearch queries and data transfers are very slow, and I am getting timeout errors. I found the following entries in the log files of the data nodes.
Elasticsearch and Kibana version 7.4.2
I have 13 nodes (3 master nodes, 10 data nodes).
Each node has 64 GB of RAM and 5 TB of storage.
jvm.options on each node: -Xms30g -Xmx30g
[2021-01-06T09:38:30,253][INFO ][o.e.m.j.JvmGcMonitorService] [data-5] [gc][31546102] overhead, spent [287ms] collecting in the last [1s]
[2021-01-06T09:55:12,720][INFO ][o.e.m.j.JvmGcMonitorService] [data-5] [gc][31547354] overhead, spent [303ms] collecting in the last [1s]
[2021-01-06T10:04:17,502][INFO ][o.e.m.j.JvmGcMonitorService] [data-5] [gc][31546102] overhead, spent [287ms] collecting in the last [1.2s]
[2021-01-06T10:54:54,473][INFO ][o.e.m.j.JvmGcMonitorService] [data-5] [gc][31552295] overhead, spent [340ms] collecting in the last [1s]
[2021-01-06T11:41:58,347][WARN ][o.e.m.j.JvmGcMonitorService] [data-5] [gc][31555117] overhead, spent [521ms] collecting in the last [1s]
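To make the overhead lines above easier to compare, here is a minimal Python sketch that converts each one into the percentage of the sampling window spent in garbage collection. The regular expression is written against the lines quoted here only; the helper names (`to_ms`, `gc_overhead_percent`) are made up for illustration and are not part of Elasticsearch.

```python
import re

# Matches the JvmGcMonitorService "overhead" lines quoted in this post.
GC_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\[(?P<level>\w+)\s*\].*?"
    r"spent \[(?P<spent>[\d.]+)(?P<spent_unit>ms|s)\] "
    r"collecting in the last \[(?P<window>[\d.]+)(?P<window_unit>ms|s)\]"
)


def to_ms(value, unit):
    """Convert a value given in 'ms' or 's' to milliseconds."""
    return float(value) * (1000.0 if unit == "s" else 1.0)


def gc_overhead_percent(line):
    """Return (timestamp, level, percent of window spent in GC), or None."""
    m = GC_LINE.search(line)
    if m is None:
        return None
    spent = to_ms(m["spent"], m["spent_unit"])
    window = to_ms(m["window"], m["window_unit"])
    return m["ts"], m["level"], round(100.0 * spent / window, 1)


if __name__ == "__main__":
    sample = (
        "[2021-01-06T11:41:58,347][WARN ][o.e.m.j.JvmGcMonitorService] "
        "[data-5] [gc][31555117] overhead, spent [521ms] collecting in the last [1s]"
    )
    print(gc_overhead_percent(sample))
```

For the WARN line, this works out to 521 ms out of 1000 ms, i.e. about half of that second was spent collecting; the INFO lines are in the 24 to 34 percent range.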
It looks like you have an update-heavy workload and about 3.7 TB of data per data node. Is this correct?
When it comes to indexing and querying, the limiting factor in a cluster is often the performance of the underlying storage. What type of storage are you using? Locally attached SSDs? Have you looked at how your storage is performing and how much iowait you are seeing, e.g. using the iostat utility?
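If iostat is not installed on the data nodes, a rough equivalent of its CPU %iowait figure can be sketched in Python by sampling the aggregate `cpu` line of `/proc/stat` twice. This assumes a Linux host; the function names and the 5-second interval are illustrative choices, and the sketch gives no per-device statistics, which `iostat -x` would.

```python
import time


def parse_cpu_line(line):
    """Return the first eight counters (user .. steal) of the aggregate cpu line."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line from /proc/stat")
    return [int(x) for x in fields[1:9]]


def iowait_percent(before, after):
    """Percent of CPU time spent in iowait between two /proc/stat samples."""
    deltas = [b - a for a, b in zip(parse_cpu_line(before), parse_cpu_line(after))]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    # Index 4 is the iowait counter: user, nice, system, idle, iowait, ...
    return round(100.0 * deltas[4] / total, 1)


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat, which aggregates all CPUs."""
    with open(path) as f:
        return f.readline()


if __name__ == "__main__":
    first = read_cpu_line()
    time.sleep(5)  # sampling interval, chosen arbitrarily
    second = read_cpu_line()
    print(f"iowait: {iowait_percent(first, second)}%")
```

Running this on a data node while a bulk load is in progress gives a first impression of whether the CPUs are mostly waiting on disk.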
Another thing to verify is that you are using compressed ordinary object pointers (compressed oops). With a 30 GB heap it looks like you are, but Elasticsearch prints this in its logs on startup, so you can confirm it there.
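Besides the startup log, the nodes info API (`GET _nodes/jvm`) reports a `using_compressed_ordinary_object_pointers` value per node. The following is a small sketch, not a definitive tool: the base URL is a placeholder, it assumes the cluster is reachable without authentication, and the function names are invented for this example.

```python
import json
from urllib.request import urlopen


def compressed_oops_by_node(nodes_info):
    """Map node name -> True if the node reports compressed oops in use."""
    result = {}
    for node in nodes_info.get("nodes", {}).values():
        raw = node.get("jvm", {}).get("using_compressed_ordinary_object_pointers")
        # The value may come back as the string "true"/"false", so normalise it.
        result[node.get("name", "?")] = str(raw).lower() == "true"
    return result


def fetch_nodes_jvm(base_url="http://localhost:9200"):
    """Fetch JVM info for all nodes; base_url is a placeholder for your cluster."""
    with urlopen(base_url + "/_nodes/jvm", timeout=30) as resp:
        return json.load(resp)


if __name__ == "__main__":
    for name, enabled in sorted(compressed_oops_by_node(fetch_nodes_jvm()).items()):
        print(f"{name}: compressed oops = {enabled}")
```

Any node that reports `False` here would be worth checking against its jvm.options, since compressed oops are generally only available for heaps below roughly 32 GB.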
I get a timeout error on the Kibana Stack Monitoring screen and when transferring data from SQL to Elasticsearch using C# and the NEST library.
Unsuccessful () low level call on POST: /ticket/_bulk?refresh=false
# Audit trail of this API call:
- [1] BadResponse: Node: http://192.168.3.71:9200/ Took: 00:01:00.0140872
- [2] MaxTimeoutReached:
# OriginalException: Elasticsearch.Net.ElasticsearchClientException: Maximum timeout reached while retrying request. Call: Status code unknown from: POST /ticket/_bulk?refresh=false ---> System.Net.WebException: The operation has timed out
at System.Net.HttpWebRequest.GetResponse()
at Elasticsearch.Net.HttpWebRequestConnection.Request[TResponse](RequestData requestData)
--- End of inner exception stack trace ---
# Request:
<Request stream not captured or already read to completion by serializer. Set DisableDirectStreaming() on ConnectionSettings to force it to be set on the response.>
# Response:
<Response stream not captured or already read to completion by serializer. Set DisableDirectStreaming() on ConnectionSettings to force it to be set on the response.>