Elasticsearch issue

Hello there,

I am using ELK 7.9.2 and I see the following issue with Elasticsearch and Kibana.

**Elasticsearch**

[2020-12-09T15:13:39,596][WARN ][o.e.c.r.a.AllocationService] [master] [.kibana_1][0] marking unavailable shards as stale: [dGFQCi6tQRKTO2hQrQ6m9A]
[2020-12-09T15:13:40,516][WARN ][o.e.c.r.a.AllocationService] [master] [.kibana_task_manager_1][0] marking unavailable shards as stale: [LqPX3TXQTHeUwhc4eKs2pw]
[2020-12-09T15:13:40,516][WARN ][o.e.c.r.a.AllocationService] [master] [.apm-agent-configuration][0] marking unavailable shards as stale: [R_6uJBYeT1iLtP6SZ6_OLg]
[2020-12-09T15:18:14,562][INFO ][o.e.c.m.MetadataIndexTemplateService] [master] adding template [.management-beats] for index patterns [.management-beats]
[2020-12-09T15:18:20,053][INFO ][o.e.m.j.JvmGcMonitorService] [master] [gc][435] overhead, spent [303ms] collecting in the last [1s]
[2020-12-09T15:28:29,020][WARN ][o.e.c.s.MasterService    ] [master] took [10.2m], which is over [10s], to compute cluster state update for [create-index-template [.management-beats], cause [api]]
[2020-12-09T15:28:29,140][WARN ][r.suppressed             ] [master] path: /.apm-agent-configuration/_mapping, params: {index=.apm-agent-configuration}
org.elasticsearch.cluster.metadata.ProcessClusterEventTimeoutException: failed to process cluster event (put-mapping [.apm-agent-configuration/kCLeibGPTRq_6yq-MuUcuw]) within 30s
        at org.elasticsearch.cluster.service.MasterService$Batcher.lambda$onTimeout$0(MasterService.java:143) [elasticsearch-7.9.2.jar:7.9.2]
        at java.util.ArrayList.forEach(ArrayList.java:1511) [?:?]
        at org.elasticsearch.cluster.service.MasterService$Batcher.lambda$onTimeout$1(MasterService.java:142) [elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:678) [elasticsearch-7.9.2.jar:7.9.2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
        at java.lang.Thread.run(Thread.java:832) [?:?]
[2020-12-09T15:28:29,116][WARN ][r.suppressed             ] [master] path: /.apm-agent-configuration/_mapping, params: {index=.apm-agent-configuration}
org.elasticsearch.cluster.metadata.ProcessClusterEventTimeoutException: failed to process cluster event (put-mapping [.apm-agent-configuration/kCLeibGPTRq_6yq-MuUcuw]) within 30s
        at org.elasticsearch.cluster.service.MasterService$Batcher.lambda$onTimeout$0(MasterService.java:143) [elasticsearch-7.9.2.jar:7.9.2]
        at java.util.ArrayList.forEach(ArrayList.java:1511) [?:?]
        at org.elasticsearch.cluster.service.MasterService$Batcher.lambda$onTimeout$1(MasterService.java:142) [elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:678) [elasticsearch-7.9.2.jar:7.9.2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
        at java.lang.Thread.run(Thread.java:832) [?:?]
[2020-12-09T15:28:29,088][WARN ][r.suppressed             ] [master] path: /.apm-custom-link/_mapping, params: {index=.apm-custom-link}
org.elasticsearch.cluster.metadata.ProcessClusterEventTimeoutException: failed to process cluster event (put-mapping [.apm-custom-link/kgZfWqBCRMSIcy0o-apnjg]) within 30s
        at org.elasticsearch.cluster.service.MasterService$Batcher.lambda$onTimeout$0(MasterService.java:143) [elasticsearch-7.9.2.jar:7.9.2]
        at java.util.ArrayList.forEach(ArrayList.java:1511) [?:?]
        at org.elasticsearch.cluster.service.MasterService$Batcher.lambda$onTimeout$1(MasterService.java:142) [elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:678) [elasticsearch-7.9.2.jar:7.9.2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
        at java.lang.Thread.run(Thread.java:832) [?:?]
[2020-12-09T15:28:29,181][WARN ][r.suppressed             ] [master] path: /.apm-custom-link/_mapping, params: {index=.apm-custom-link}
org.elasticsearch.cluster.metadata.ProcessClusterEventTimeoutException: failed to process cluster event (put-mapping [.apm-custom-link/kgZfWqBCRMSIcy0o-apnjg]) within 30s
        at org.elasticsearch.cluster.service.MasterService$Batcher.lambda$onTimeout$0(MasterService.java:143) [elasticsearch-7.9.2.jar:7.9.2]
        at java.util.ArrayList.forEach(ArrayList.java:1511) [?:?]
        at org.elasticsearch.cluster.service.MasterService$Batcher.lambda$onTimeout$1(MasterService.java:142) [elasticsearch-7.9.2.jar:7.9.2]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:678) [elasticsearch-7.9.2.jar:7.9.2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
        at java.lang.Thread.run(Thread.java:832) [?:?]

**Kibana**

  log   [20:22:56.766] [error][plugins][taskManager][taskManager] Failed to poll for work: Error: Request Timeout after 30000ms
  log   [20:22:56.851] [warning][apm][plugins] Could not create index: '.apm-custom-link'. Retrying...
  log   [20:22:56.852] [warning][apm][plugins] { Error: Request Timeout after 30000ms
    at C:\ELK7.9.2\kibana-7.9.2-windows-x86_64\node_modules\elasticsearch\src\lib\transport.js:397:9
    at Timeout.<anonymous> (C:\ELK7.9.2\kibana-7.9.2-windows-x86_64\node_modules\elasticsearch\src\lib\transport.js:429:7)
    at ontimeout (timers.js:436:11)
    at tryOnTimeout (timers.js:300:5)
    at listOnTimeout (timers.js:263:5)
    at Timer.processTimers (timers.js:223:10)
  status: undefined,
  displayName: 'RequestTimeout',
  message: 'Request Timeout after 30000ms',
  body: undefined,
  attemptNumber: 1,
  retriesLeft: 10 }
  log   [20:22:56.864] [warning][apm][plugins] Could not create index: '.apm-agent-configuration'. Retrying...
  log   [20:22:56.864] [warning][apm][plugins] { Error: Request Timeout after 30000ms
    at C:\ELK7.9.2\kibana-7.9.2-windows-x86_64\node_modules\elasticsearch\src\lib\transport.js:397:9
    at Timeout.<anonymous> (C:\ELK7.9.2\kibana-7.9.2-windows-x86_64\node_modules\elasticsearch\src\lib\transport.js:429:7)
    at ontimeout (timers.js:436:11)
    at tryOnTimeout (timers.js:300:5)
    at listOnTimeout (timers.js:263:5)
    at Timer.processTimers (timers.js:223:10)
  status: undefined,
  displayName: 'RequestTimeout',
  message: 'Request Timeout after 30000ms',
  body: undefined,
  attemptNumber: 1,
  retriesLeft: 10 }
 error  [20:22:59.399] [warning][process] UnhandledPromiseRejectionWarning: Error: Request Timeout after 30000ms
    at C:\ELK7.9.2\kibana-7.9.2-windows-x86_64\node_modules\elasticsearch\src\lib\transport.js:397:9
    at Timeout.<anonymous> (C:\ELK7.9.2\kibana-7.9.2-windows-x86_64\node_modules\elasticsearch\src\lib\transport.js:429:7)

I am new to Elasticsearch and am having trouble understanding this issue. Any help would be appreciated.

Thanks.

What is the specification of your cluster? What hardware are you using? What resources are allocated to Elasticsearch? What is the full output of the cluster stats API?
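
In case it helps, here is a minimal sketch of pulling that output with Python's standard library. It assumes a node answers on `http://localhost:9200` without authentication or TLS (both assumptions, adjust for your setup), and the helper names are illustrative only.

```python
import json
import urllib.request


def cluster_stats_url(base_url: str = "http://localhost:9200") -> str:
    """Build the cluster stats URL; `human` adds readable units, `pretty` indents the JSON."""
    return base_url.rstrip("/") + "/_cluster/stats?human&pretty"


def fetch_cluster_stats(base_url: str = "http://localhost:9200", timeout: float = 30.0) -> dict:
    """GET the cluster stats from one node and decode the JSON body."""
    with urllib.request.urlopen(cluster_stats_url(base_url), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_cluster_stats()` against a running node returns the same structure as a `GET /_cluster/stats` request from Kibana Dev Tools.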

Here is the output of the cluster stats API.


{
  "_nodes" : {
    "total" : 3,
    "successful" : 3,
    "failed" : 0
  },
  "cluster_name" : "cbsa",
  "cluster_uuid" : "gRVm3XAKR3ayC2fJJKSduA",
  "timestamp" : 1607547003570,
  "status" : "green",
  "indices" : {
    "count" : 17,
    "shards" : {
      "total" : 34,
      "primaries" : 17,
      "replication" : 1.0,
      "index" : {
        "shards" : {
          "min" : 2,
          "max" : 2,
          "avg" : 2.0
        },
        "primaries" : {
          "min" : 1,
          "max" : 1,
          "avg" : 1.0
        },
        "replication" : {
          "min" : 1.0,
          "max" : 1.0,
          "avg" : 1.0
        }
      }
    },
    "docs" : {
      "count" : 72844906,
      "deleted" : 19757
    },
    "store" : {
      "size_in_bytes" : 50622758592,
      "reserved_in_bytes" : 0
    },
    "fielddata" : {
      "memory_size_in_bytes" : 0,
      "evictions" : 0
    },
    "query_cache" : {
      "memory_size_in_bytes" : 0,
      "total_count" : 37,
      "hit_count" : 0,
      "miss_count" : 37,
      "cache_size" : 0,
      "cache_count" : 0,
      "evictions" : 0
    },
    "completion" : {
      "size_in_bytes" : 0
    },
    "segments" : {
      "count" : 228,
      "memory_in_bytes" : 1287208,
      "terms_memory_in_bytes" : 606656,
      "stored_fields_memory_in_bytes" : 454720,
      "term_vectors_memory_in_bytes" : 0,
      "norms_memory_in_bytes" : 68672,
      "points_memory_in_bytes" : 0,
      "doc_values_memory_in_bytes" : 157160,
      "index_writer_memory_in_bytes" : 531024,
      "version_map_memory_in_bytes" : 278,
      "fixed_bit_set_memory_in_bytes" : 2456,
      "max_unsafe_auto_id_timestamp" : 1606945620726,
      "file_sizes" : { }
    },
    "mappings" : {
      "field_types" : [
        {
          "name" : "binary",
          "count" : 12,
          "index_count" : 2
        },
        {
          "name" : "boolean",
          "count" : 63,
          "index_count" : 5
        },
        {
          "name" : "date",
          "count" : 121,
          "index_count" : 16
        },
        {
          "name" : "flattened",
          "count" : 10,
          "index_count" : 2
        },
        {
          "name" : "float",
          "count" : 48,
          "index_count" : 7
        },
        {
          "name" : "geo_shape",
          "count" : 1,
          "index_count" : 1
        },
        {
          "name" : "integer",
          "count" : 54,
          "index_count" : 3
        },
        {
          "name" : "keyword",
          "count" : 696,
          "index_count" : 16
        },
        {
          "name" : "long",
          "count" : 110,
          "index_count" : 13
        },
        {
          "name" : "nested",
          "count" : 23,
          "index_count" : 7
        },
        {
          "name" : "object",
          "count" : 383,
          "index_count" : 12
        },
        {
          "name" : "text",
          "count" : 282,
          "index_count" : 15
        }
      ]
    },
    "analysis" : {
      "char_filter_types" : [ ],
      "tokenizer_types" : [ ],
      "filter_types" : [ ],
      "analyzer_types" : [ ],
      "built_in_char_filters" : [ ],
      "built_in_tokenizers" : [ ],
      "built_in_filters" : [ ],
      "built_in_analyzers" : [ ]
    }
  },
  "nodes" : {
    "count" : {
      "total" : 3,
      "coordinating_only" : 0,
      "data" : 2,
      "ingest" : 3,
      "master" : 1,
      "ml" : 3,
      "remote_cluster_client" : 3,
      "transform" : 2,
      "voting_only" : 0
    },
    "versions" : [
      "7.9.2"
    ],
    "os" : {
      "available_processors" : 6,
      "allocated_processors" : 6,
      "names" : [
        {
          "name" : "Windows 10",
          "count" : 3
        }
      ],
      "pretty_names" : [
        {
          "pretty_name" : "Windows 10",
          "count" : 3
        }
      ],
      "mem" : {
        "total_in_bytes" : 12883488768,
        "free_in_bytes" : 1436160000,
        "used_in_bytes" : 11447328768,
        "free_percent" : 11,
        "used_percent" : 89
      }
    },
    "process" : {
      "cpu" : {
        "percent" : 10
      },
      "open_file_descriptors" : {
        "min" : -1,
        "max" : -1,
        "avg" : 0
      }
    },
    "jvm" : {
      "max_uptime_in_millis" : 2383646,
      "versions" : [
        {
          "version" : "15",
          "vm_name" : "OpenJDK 64-Bit Server VM",
          "vm_version" : "15+36",
          "vm_vendor" : "AdoptOpenJDK",
          "bundled_jdk" : true,
          "using_bundled_jdk" : true,
          "count" : 3
        }
      ],
      "mem" : {
        "heap_used_in_bytes" : 1353154960,
        "heap_max_in_bytes" : 3221225472
      },
      "threads" : 165
    },
    "fs" : {
      "total_in_bytes" : 90016029057024,
      "free_in_bytes" : 34689444126720,
      "available_in_bytes" : 34689444126720
    },
    "plugins" : [ ],
    "network_types" : {
      "transport_types" : {
        "netty4" : 3
      },
      "http_types" : {
        "netty4" : 3
      }
    },
    "discovery_types" : {
      "zen" : 3
    },
    "packaging_types" : [
      {
        "flavor" : "default",
        "type" : "zip",
        "count" : 3
      }
    ],
    "ingest" : {
      "number_of_pipelines" : 2,
      "processor_stats" : {
        "gsub" : {
          "count" : 0,
          "failed" : 0,
          "current" : 0,
          "time_in_millis" : 0
        },
        "script" : {
          "count" : 0,
          "failed" : 0,
          "current" : 0,
          "time_in_millis" : 0
        }
      }
    }
  }
}
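
For anyone skimming the JSON, a rough per-node view can be derived from it. The sketch below assumes resources are spread evenly across the nodes, which the API does not guarantee because it reports cluster-wide totals. `summarize_resources` is a hypothetical helper name, and the sample dictionary only copies the relevant fields from the output above.

```python
GIB = 1024 ** 3


def summarize_resources(stats: dict) -> dict:
    """Reduce cluster-wide totals to approximate per-node figures (assumes equal nodes)."""
    nodes = stats["nodes"]
    node_count = nodes["count"]["total"]
    heap_max = nodes["jvm"]["mem"]["heap_max_in_bytes"]
    heap_used = nodes["jvm"]["mem"]["heap_used_in_bytes"]
    ram_total = nodes["os"]["mem"]["total_in_bytes"]
    return {
        "nodes": node_count,
        "heap_max_gib_per_node": round(heap_max / node_count / GIB, 2),
        "ram_gib_per_node": round(ram_total / node_count / GIB, 2),
        "heap_used_percent": round(100 * heap_used / heap_max, 1),
        "store_gib": round(stats["indices"]["store"]["size_in_bytes"] / GIB, 2),
    }


# Relevant fields copied from the cluster stats output posted above.
stats = {
    "indices": {"store": {"size_in_bytes": 50622758592}},
    "nodes": {
        "count": {"total": 3},
        "os": {"mem": {"total_in_bytes": 12883488768}},
        "jvm": {"mem": {"heap_used_in_bytes": 1353154960,
                        "heap_max_in_bytes": 3221225472}},
    },
}
print(summarize_resources(stats))
```

With the figures posted above this works out to about 1 GiB of heap and 4 GiB of RAM per node, with roughly 42% of the heap in use at the time of the snapshot.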

Please tell us a bit more about this cluster.

Hello Christian,

I have a three-node cluster, and I am using it to load a huge amount of data.

For that I have three VMs. Each VM has 69.5 GB on the machine, 4 GB of RAM, and a 64-bit Windows 10 operating system. The processor is an Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.5 GHz 2.10 GHz (2 processors).

All of it is used for Elasticsearch.

I appreciate your help. Let me know if you need more information.

Thanks

The `MasterService` warning in your log shows that a cluster state update took 10.2 minutes against a 10-second threshold, which indicates that something is wrong.

What type of storage are you using? SSDs?

Do you see any evidence in the logs of long or frequent garbage collection on any of the nodes?
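
Both signals can be pulled out of the Elasticsearch log with a small script. The sketch below is only an illustration: the regular expressions are written against the two line formats pasted earlier in this thread (the `MasterService` "took ... which is over ..." warning and the `JvmGcMonitorService` "overhead" message), other GC log formats are not covered, and the names `scan` and `to_seconds` are hypothetical.

```python
import re
from typing import Iterable, Iterator

# Elasticsearch time-value suffixes and their length in seconds.
_UNITS = {"nanos": 1e-9, "micros": 1e-6, "ms": 1e-3,
          "s": 1.0, "m": 60.0, "h": 3600.0, "d": 86400.0}
_DURATION = re.compile(r"^(\d+(?:\.\d+)?)(nanos|micros|ms|s|m|h|d)$")

# Matches: [ts][WARN ][o.e.c.s.MasterService ] [node] took [10.2m], which is over [10s], ...
_SLOW_UPDATE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\[WARN \]\[o\.e\.c\.s\.MasterService\s*\] \[(?P<node>[^\]]+)\] "
    r"took \[(?P<took>[^\]]+)\], which is over \[(?P<limit>[^\]]+)\]"
)
# Matches: [ts][INFO ][o.e.m.j.JvmGcMonitorService] [node] [gc][435] overhead, spent [303ms] ...
_GC_OVERHEAD = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\[\w+\s*\]\[o\.e\.m\.j\.JvmGcMonitorService\s*\] \[(?P<node>[^\]]+)\] "
    r"\[gc\]\[\d+\] overhead, spent \[(?P<spent>[^\]]+)\] "
    r"collecting in the last \[(?P<window>[^\]]+)\]"
)


def to_seconds(value: str) -> float:
    """Convert a time value such as '303ms' or '10.2m' to seconds."""
    match = _DURATION.match(value.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {value!r}")
    return float(match.group(1)) * _UNITS[match.group(2)]


def scan(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one record per slow cluster-state update or GC overhead line."""
    for line in lines:
        match = _SLOW_UPDATE.match(line)
        if match:
            yield {
                "kind": "slow_cluster_state_update",
                "timestamp": match["ts"],
                "node": match["node"],
                "seconds": to_seconds(match["took"]),
                "threshold_seconds": to_seconds(match["limit"]),
            }
            continue
        match = _GC_OVERHEAD.match(line)
        if match:
            spent = to_seconds(match["spent"])
            window = to_seconds(match["window"])
            yield {
                "kind": "gc_overhead",
                "timestamp": match["ts"],
                "node": match["node"],
                # Share of the reporting window spent in garbage collection.
                "gc_fraction": spent / window if window > 0 else float("nan"),
            }
```

Running `scan(open(path))` over each node's log file (the path depends on your installation) gives a quick list of how often these warnings occur and how long the updates took. For the lines pasted above it would report one update of about 612 seconds and one GC overhead entry at roughly 30% of a one-second window.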

Hello Christian,

The issue resolved itself for some reason. All of the errors are gone, so no further assistance is needed.

Thanks for your time and help. I really appreciate that.

Thanks,
Akhil
