Elasticsearch data transfer and queries are very slow

RusseL · January 7, 2021, 8:50am

Elasticsearch queries and data transfer are very slow, and I am getting timeout errors. I found the following entries in the log files of the data nodes.

Elasticsearch and Kibana version: 7.4.2
I have 13 nodes (3 master nodes, 10 data nodes).
Each node has 64 GB of RAM and 5 TB of storage.
Each node's jvm.options sets -Xms30g -Xmx30g.

[2021-01-06T09:38:30,253][INFO ][o.e.m.j.JvmGcMonitorService] [data-5] [gc][31546102] overhead, spent [287ms] collecting in the last [1s]
[2021-01-06T09:55:12,720][INFO ][o.e.m.j.JvmGcMonitorService] [data-5] [gc][31547354] overhead, spent [303ms] collecting in the last [1s]
[2021-01-06T10:04:17,502][INFO ][o.e.m.j.JvmGcMonitorService] [data-5] [gc][31546102] overhead, spent [287ms] collecting in the last [1.2s]
[2021-01-06T10:54:54,473][INFO ][o.e.m.j.JvmGcMonitorService] [data-5] [gc][31552295] overhead, spent [340ms] collecting in the last [1s]
[2021-01-06T11:41:58,347][WARN ][o.e.m.j.JvmGcMonitorService] [data-5] [gc][31555117] overhead, spent [521ms] collecting in the last [1s]
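
To put the GC lines above in perspective, each one can be read as a percentage of wall-clock time spent collecting. The following is a minimal sketch, not an official tool; the function names `gc_overhead` and `_to_ms` are made up for this illustration, and the regular expression only assumes the log layout shown in this post.

```python
import re

# Matches the JvmGcMonitorService "overhead" lines in the layout shown above.
GC_LINE = re.compile(
    r"\[(?P<ts>[^\]]+)\]\[(?P<level>\w+)\s*\]\[o\.e\.m\.j\.JvmGcMonitorService\] "
    r"\[(?P<node>[^\]]+)\] \[gc\]\[\d+\] overhead, "
    r"spent \[(?P<spent>[\d.]+)(?P<spent_unit>ms|s)\] "
    r"collecting in the last \[(?P<window>[\d.]+)(?P<window_unit>ms|s)\]"
)


def _to_ms(value, unit):
    """Convert a value logged in 's' or 'ms' to milliseconds."""
    return float(value) * (1000.0 if unit == "s" else 1.0)


def gc_overhead(line):
    """Return (node, level, percent of the window spent in GC), or None if no match."""
    m = GC_LINE.search(line)
    if m is None:
        return None
    spent = _to_ms(m["spent"], m["spent_unit"])
    window = _to_ms(m["window"], m["window_unit"])
    return m["node"], m["level"], round(100.0 * spent / window, 1)
```

For the WARN line above (521ms in the last 1s) this gives 52.1 percent, and for the first INFO line (287ms in 1s) it gives 28.7 percent.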

GET _cluster/stats output:

github.com

Resul-Hasturk/Elasticsearch/blob/main/elastic

{
  "_nodes" : {
    "total" : 13,
    "successful" : 13,
    "failed" : 0
  },
  "cluster_name" : "escls01",
  "cluster_uuid" : "a14P_D4BQmaCeJalR-zU6w",
  "timestamp" : 1610008890405,
  "status" : "green",
  "indices" : {
    "count" : 104,
    "shards" : {
      "total" : 976,
      "primaries" : 488,
      "replication" : 1.0,
      "index" : {
        "shards" : {
          "min" : 2,
          "max" : 80,

This file has been truncated.

warkolm · January 11, 2021, 12:29am

Timeout errors from where?

Christian_Dahlqvist · January 11, 2021, 7:36am

It looks like you have an update-heavy workload and about 3.7TB of data per data node. Is this correct?

When it comes to indexing and querying, the limiting factor in a cluster is often the performance of the underlying storage. What type of storage are you using? Locally attached SSDs? Have you looked at how your storage is performing and how much iowait you are seeing, e.g. using the iostat utility?
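
On a Linux data node, `iostat -x 5` (from the sysstat package) prints extended per-device statistics every 5 seconds, including %iowait in the CPU summary. If iostat is not installed, the same iowait figure can be approximated from `/proc/stat`. The sketch below assumes a Linux host; the function names are made up for this illustration.

```python
import time


def read_cpu_times(path="/proc/stat"):
    """Return the aggregate CPU counters from the first line of /proc/stat (Linux only)."""
    with open(path) as f:
        fields = f.readline().split()
    # fields[0] is the literal "cpu"; the rest are cumulative tick counters.
    return [int(x) for x in fields[1:]]


def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two samples.

    Counter order: user, nice, system, idle, iowait, irq, softirq, steal, ...
    """
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas[:8])  # guest time is already counted inside user
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


def sample_iowait(interval=5.0):
    """Take two samples `interval` seconds apart and return the iowait percentage."""
    first = read_cpu_times()
    time.sleep(interval)
    return iowait_percent(first, read_cpu_times())
```

Calling `sample_iowait()` a few times while a bulk load is running gives a rough view of whether the CPUs are mostly waiting on disk. It does not replace the per-device view from iostat.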

Another thing to verify is that you are using compressed pointers. It looks like you are but this should be printed in the Elasticsearch logs on startup.
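
Besides the startup log line (which reports the heap size together with "compressed ordinary object pointers"), the node info API exposes the same flag per node. The following sketch reads it with the standard library only. The base URL is an assumption, the function names are made up for this illustration, and a secured cluster would additionally need credentials.

```python
import json
from urllib.request import urlopen


def nodes_without_compressed_oops(nodes_info):
    """Return names of nodes whose JVM does not report compressed oops as enabled.

    `nodes_info` is the parsed JSON body of GET _nodes/jvm.
    """
    flagged = []
    for node in nodes_info.get("nodes", {}).values():
        value = node.get("jvm", {}).get("using_compressed_ordinary_object_pointers")
        if str(value).lower() != "true":
            flagged.append(node.get("name", "<unnamed>"))
    return sorted(flagged)


def fetch_nodes_jvm(base_url="http://localhost:9200"):
    """Fetch GET _nodes/jvm; base_url is an assumption, adjust it to your cluster."""
    with urlopen(f"{base_url}/_nodes/jvm", timeout=30) as resp:
        return json.load(resp)
```

`nodes_without_compressed_oops(fetch_nodes_jvm())` should return an empty list if every node runs with compressed pointers, which is expected with a 30 GB heap.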

RusseL · January 11, 2021, 8:56am

Hi warkolm,

I get a timeout error on the Kibana Stack Monitoring screen and when transferring data from SQL to Elasticsearch using C# and the NEST library.

Unsuccessful () low level call on POST: /ticket/_bulk?refresh=false
# Audit trail of this API call:
- [1] BadResponse: Node: http://192.168.3.71:9200/ Took: 00:01:00.0140872
- [2] MaxTimeoutReached:
# OriginalException: Elasticsearch.Net.ElasticsearchClientException: Maximum timeout reached 
while retrying request. Call: Status code unknown from: POST /ticket/_bulk?refresh=false ---> 
System.Net.WebException: The operation has timed out
  at System.Net.HttpWebRequest.GetResponse()
  at Elasticsearch.Net.HttpWebRequestConnection.Request[TResponse](RequestData 
requestData)
  --- End of inner exception stack trace ---
# Request:
<Request stream not captured or already read to completion by serializer. Set 
 DisableDirectStreaming() on ConnectionSettings to force it to be set on the response.>
# Response:
<Response stream not captured or already read to completion by serializer. Set 
DisableDirectStreaming() on ConnectionSettings to force it to be set on the response.>

RusseL · January 11, 2021, 9:19am

Hi @Christian_Dahlqvist,

"It looks like you have an update-heavy workload and about 3.7TB of data per data node. Is this correct?"

Yes, that is correct.

The storage is a mixture of SSD and SATA disks.

I haven't checked with the iostat utility because I don't know how to use it.

The Elasticsearch logs only show this message:

[2020-12-23T16:08:01,003][ERROR][o.e.x.m.c.c.ClusterStatsCollector] [master-3] collector [cluster_stats] timed out when collecting data

Maybe this helps.

GET _cluster/nodes/hot_threads output:

github.com

Resul-Hasturk/Elasticsearch/blob/main/hot_thread

::: {data-9}{2nEK_Bt1R5yfObiZTBha2Q}{i1xHvs2dS02GO1hkvDIFQg}{192.168.3.57}{192.168.3.57:9300}{dil}{ml.machine_memory=270435311616, ml.max_open_jobs=20, xpack.installed=true, disk_type=ssd}
   Hot threads at 2021-01-11T08:59:09.543Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:
   
   77.2% (385.8ms out of 500ms) cpu usage by thread 'elasticsearch[data-9][[pts-2021.1][6]: Lucene Merge Thread #4813]'
     2/10 snapshots sharing following 13 elements
       app//org.apache.lucene.index.MappingMultiPostingsEnum.nextDoc(MappingMultiPostingsEnum.java:103)
       app//org.apache.lucene.codecs.PushPostingsWriterBase.writeTerm(PushPostingsWriterBase.java:135)
       app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.write(BlockTreeTermsWriter.java:865)
       app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter.write(BlockTreeTermsWriter.java:344)
       app//org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:105)
       app//org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.merge(PerFieldPostingsFormat.java:169)
       app//org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:245)
       app//org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:140)
       app//org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4462)
       app//org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4056)
       app//org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:625)
       app//org.elasticsearch.index.engine.ElasticsearchConcurrentMergeScheduler.doMerge(ElasticsearchConcurrentMergeScheduler.java:101)
       app//org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:662)
     3/10 snapshots sharing following 13 elements
       app//org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter$TermsWriter.writeBlocks(BlockTreeTermsWriter.java:603)

This file has been truncated.
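
A long hot_threads dump like the one above is easier to scan when reduced to one line per thread. The sketch below pulls out the node name and the CPU percentage of each reported thread and sorts them, busiest first. It only assumes the text layout visible in the excerpt; the function name is made up for this illustration.

```python
import re

# Node header, e.g. "::: {data-9}{...}{...}"
NODE = re.compile(r"^::: \{(?P<node>[^}]+)\}")
# Thread summary, e.g. "77.2% (385.8ms out of 500ms) cpu usage by thread '...'"
THREAD = re.compile(
    r"^\s*(?P<pct>[\d.]+)% \([^)]*\) cpu usage by thread '(?P<name>.+)'\s*$"
)


def busiest_threads(text):
    """Return (cpu_percent, node, thread_name) tuples, highest CPU usage first."""
    node = None
    rows = []
    for line in text.splitlines():
        m = NODE.match(line)
        if m:
            node = m["node"]
            continue
        m = THREAD.match(line)
        if m:
            rows.append((float(m["pct"]), node, m["name"]))
    return sorted(rows, key=lambda r: r[0], reverse=True)
```

In the excerpt above, the top entry on data-9 is a Lucene merge thread at 77.2 percent.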

system · February 8, 2021, 9:19am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
