Kibana Monitoring Issue and Sharding Problem

Hi there,
I have two problems which are probably related to each other:

I have a basic cluster with 2 data nodes.
But the cluster status is always "Yellow", and these 2 nodes cannot distribute the primary shards and replicas between them.
So I ran a reroute: curl -XPOST "localhost:9200/_cluster/reroute?retry_failed=true"
It returned some output but didn't work properly (there are still a number of unassigned shards).
These are some of the logs:

{"acknowledged":true,"state":{"cluster_uuid":"nembwwEEQGCpPb0zW0C8_A","version":261913,"state_uuid":"vK1LtVbzQ8-j7UNJ7DV-6w","master_node":"veBpqUZrSIC9pX1SNrTvug","blocks":{},"nodes":{"2m_WCRVhQk6SAHo6zIG6cA":{"name":"elk2","ephemeral_id":"icEWYUHrTS2lxW7lGJ9sBw","transport_address":"172.22.34.37:9300","attributes":{"ml.machine_memory":"3973464064","ml.max_open_jobs":"20","xpack.installed":"true"}},"veBpqUZrSIC9pX1SNrTvug":{"name":"elk1","ephemeral_id":"IAcFFk3TROapKhfq2FB7tg","transport_address":"172.22.34.36:9300","attributes":{"ml.machine_memory":"3973468160","ml.max_open_jobs":"20","xpack.installed":"true"}}},"routing_table":{"indices":{"testlog-2020.07.15":{"shards":{"0":[{"state":"STARTED","primary":true,"node":"veBpqUZrSIC9pX1SNrTvug","relocating_node":null,"shard":0,"index":"testlog-2020.07.15","allocation_id":{"id":"kM1UZhkxSbe1ModVTQxkmw"}},{"state":"UNASSIGNED","primary":false,"node":null,"relocating_node":null,"shard":0,"index":"testlog-2020.07.15","recovery_source":{"type":"PEER"},"unassigned_info":{"reason":"NODE_LEFT","at":"2021-04-26T05:01:20.117Z","delayed":false,"details":"node_left [2m_WCRVhQk6SAHo6zIG6cA]","allocation_status":"no_attempt"}}]}},"testlog-2020.11.27":{"shards":{"0":[{"state":"STARTED","primary":true,"node":"veBpqUZrSIC9pX1SNrTvug","relocating_node":null,"shard":0,"index":"testlog-2020.11.27","allocation_id":{"id":"WmSFM1m2T7uzCcc66R0A_w"}},{"state":"UNASSIGNED","primary":false,"node":null,"relocating_node":null,"shard":0,"index":"testlog-2020.11.27","recovery_source":{"type":"PEER"},"unassigned_info":{"reason":"NODE_LEFT","at":"2021-04-26T05:01:20.117Z","delayed":false,"details":"node_left 
[2m_WCRVhQk6SAHo6zIG6cA]","allocation_status":"no_attempt"}}]}},"testlog-2020.09.24":{"shards":{"0":[{"state":"STARTED","primary":true,"node":"veBpqUZrSIC9pX1SNrTvug","relocating_node":null,"shard":0,"index":"testlog-2020.09.24","allocation_id":{"id":"zWPF9tY4SpeEjlCrbWqJeA"}},{"state":"UNASSIGNED","primary":false,"node":null,"relocating_node":null,"shard":0,"index":"testlog-2020.09.24","recovery_source":{"type":"PEER"},"unassigned_info":{"reason":"NODE_LEFT","at":"2021-04-24T09:47:12.975Z","delayed":false,"details":"node_left [2m_WCRVhQk6SAHo6zIG6cA]","allocation_status":"no_attempt"}}]}}...
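To see which shards are unassigned and why, without digging through the full cluster state, a quick check (a sketch, assuming the cluster listens on localhost:9200) is the _cat/shards and allocation explain APIs:

```
# List shards that are not STARTED, with the reason they are unassigned
curl -s "localhost:9200/_cat/shards?v&h=index,shard,prirep,state,unassigned.reason" | grep -v STARTED

# Ask the cluster why the next unassigned shard cannot be allocated
curl -s "localhost:9200/_cluster/allocation/explain?pretty"
```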

This is my Kibana monitoring:

Besides that,
I have another problem: I see some errors in Stack Monitoring:

My RAM and JVM heap look OK (each node has 4 GB of RAM and a 1.9 GB JVM heap), so I don't know where the problem is.
These are the relevant logs in Kibana:

{"type":"log","@timestamp":"2021-04-26T05:36:07Z","tags":["status","plugin:spaces@7.6.2","error"],"pid":4153,"state":"red","message":"Status changed from red to red - [parent] Data too large, data for [<http_request>] would be [989246376/943.4mb], which is larger than the limit of [986932838/941.2mb], real usage: [989246376/943.4mb], new bytes reserved: [0/0b], usages [request=0/0b, fielddata=44990/43.9kb, in_flight_requests=0/0b, accounting=60808081/57.9mb]: [circuit_breaking_exception] [parent] Data too large, data for [<http_request>] would be [989246376/943.4mb], which is larger than the limit of [986932838/941.2mb], real usage: [989246376/943.4mb], new bytes reserved: [0/0b], usages [request=0/0b, fielddata=44990/43.9kb, in_flight_requests=0/0b, accounting=60808081/57.9mb], with { bytes_wanted=989246376 & bytes_limit=986932838 & durability=\"PERMANENT\" }","prevState":"red","prevMsg":"[parent] Data too large, data for [<http_request>] would be [988629224/942.8mb], which is larger than the limit of [986932838/941.2mb], real usage: [988629224/942.8mb], new bytes reserved: [0/0b], usages [request=0/0b, fielddata=44990/43.9kb, in_flight_requests=0/0b, accounting=60808081/57.9mb]: [circuit_breaking_exception] [parent] Data too large, data for [<http_request>] would be [988629224/942.8mb], which is larger than the limit of [986932838/941.2mb], real usage: [988629224/942.8mb], new bytes reserved: [0/0b], usages [request=0/0b, fielddata=44990/43.9kb, in_flight_requests=0/0b, accounting=60808081/57.9mb], with { bytes_wanted=988629224 & bytes_limit=986932838 & durability=\"PERMANENT\" }"}

And this is one of the Elasticsearch logs at the moment:

[2021-04-26T10:11:53,115][DEBUG][o.e.a.g.TransportGetAction] [elk2] null: failed to execute [get [.kibana][_doc][space:default]: routing [null]]
org.elasticsearch.transport.RemoteTransportException: [elk1][172.22.34.36:9300][indices:data/read/get[s]]
Caused by: org.elasticsearch.common.breaker.CircuitBreakingException: [parent] Data too large, data for [<transport_request>] would be [990432222/944.5mb], which is larger than the limit of [986932838/941.2mb], real usage: [990431960/944.5mb], new bytes reserved: [262/262b], usages [request=0/0b, fielddata=47965/46.8kb, in_flight_requests=262/262b, accounting=60837409/58mb]
        at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.checkParentLimit(HierarchyCircuitBreakerService.java:343) ~[elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.common.breaker.ChildMemoryCircuitBreaker.addEstimateBytesAndMaybeBreak(ChildMemoryCircuitBreaker.java:128) ~[elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.transport.InboundHandler.handleRequest(InboundHandler.java:171) [elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.transport.InboundHandler.messageReceived(InboundHandler.java:119) [elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.transport.InboundHandler.inboundMessage(InboundHandler.java:103) [elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.transport.TcpTransport.inboundMessage(TcpTransport.java:667) [elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:62) [transport-netty4-client-7.6.2.jar:7.6.2]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) [netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:326) [netty-codec-4.1.43.Final.jar:4.1.43.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:300) [netty-codec-4.1.43.Final.jar:4.1.43.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) [netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:241) [netty-handler-4.1.43.Final.jar:4.1.43.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) [netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1422) [netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:931) [netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:700) [netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:600) [netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:554) [netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:514) [netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1050) [netty-common-4.1.43.Final.jar:4.1.43.Final]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.43.Final.jar:4.1.43.Final]
        at java.lang.Thread.run(Thread.java:830) [?:?]
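The circuit_breaking_exception above means the JVM heap is effectively full: the parent breaker limit of 941.2 MB is 95% of the heap, which suggests the node is actually running with roughly a 1 GB heap rather than 1.9 GB. One way to raise the heap (a sketch, assuming the default package location of jvm.options; the heap should be at most ~50% of the node's RAM) is:

```
# /etc/elasticsearch/jvm.options
# Fixed-size heap; Xms and Xmx should be set to the same value.
-Xms2g
-Xmx2g
```

The live breaker state can be inspected with GET _nodes/stats/breaker, which reports each breaker's configured limit and estimated usage.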

Could you help me with these problems?
How can I fix the monitoring issue, and why doesn't my status change to "Green"?

Thanks in advance

TL;DR: you have way too many shards for your heap size. Either increase the heap size or reduce your shard count.
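A common rule of thumb (a guideline, not a hard limit enforced by Elasticsearch) is to aim for at most around 20 shards per GB of JVM heap. A quick sketch of that budget:

```shell
#!/bin/sh
# Rough shard budget: ~20 shards per GB of JVM heap (rule of thumb, not a hard limit)
heap_gb=1                      # approximate heap per node, in GB (assumed value)
shards_per_gb=20
budget=$((heap_gb * shards_per_gb))
echo "recommended max shards per node: $budget"
```

Compare that budget against the total shown by `curl -s localhost:9200/_cat/shards | wc -l`; daily indices like testlog-* accumulate shards very quickly.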

Hi,
Yes, you're right. The problem was my heap size. I have to increase the amount of RAM so I can assign more heap.
Thank you so much


Hi again,
I have increased my heap, but this problem still remains:


This is just a part of my Elasticsearch log:

[2021-05-03T15:46:20,196][WARN ][o.e.i.IndexService       ] [elk2] [testlog-2020.12.14] failed to write dangling indices state for index [testlog-2020.12.14/vgLF61z6SB2-LKXkyj03vQ]
org.elasticsearch.gateway.WriteStateException: exception during looking up new generation id
        at org.elasticsearch.gateway.MetaDataStateFormat.write(MetaDataStateFormat.java:225) ~[elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.gateway.MetaDataStateFormat.writeAndCleanup(MetaDataStateFormat.java:185) ~[elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.index.IndexService.writeDanglingIndicesInfo(IndexService.java:337) [elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.indices.IndicesService$5.doRun(IndicesService.java:1559) [elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:692) [elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.6.2.jar:7.6.2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:830) [?:?]
Caused by: java.nio.file.FileSystemException: /var/lib/elasticsearch/nodes/0/indices/vgLF61z6SB2-LKXkyj03vQ/_state: Too many open files
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:100) ~[?:?]
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111) ~[?:?]
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116) ~[?:?]
        at sun.nio.fs.UnixFileSystemProvider.newDirectoryStream(UnixFileSystemProvider.java:432) ~[?:?]
        at java.nio.file.Files.newDirectoryStream(Files.java:543) ~[?:?]
        at org.elasticsearch.gateway.MetaDataStateFormat.findMaxGenerationId(MetaDataStateFormat.java:353) ~[elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.gateway.MetaDataStateFormat.write(MetaDataStateFormat.java:222) ~[elasticsearch-7.6.2.jar:7.6.2]
        ... 8 more
[2021-05-03T15:46:20,200][WARN ][o.e.i.IndexService       ] [elk2] [testlog-2021.01.25] failed to write dangling indices state for index [testlog-2021.01.25/2fbsoSMhSnK2ecyehTW75w]
org.elasticsearch.gateway.WriteStateException: exception during looking up new generation id
        at org.elasticsearch.gateway.MetaDataStateFormat.write(MetaDataStateFormat.java:225) ~[elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.gateway.MetaDataStateFormat.writeAndCleanup(MetaDataStateFormat.java:185) ~[elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.index.IndexService.writeDanglingIndicesInfo(IndexService.java:337) [elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.indices.IndicesService$5.doRun(IndicesService.java:1559) [elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:692) [elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.6.2.jar:7.6.2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:830) [?:?]
Caused by: java.nio.file.FileSystemException: /var/lib/elasticsearch/nodes/0/indices/2fbsoSMhSnK2ecyehTW75w/_state: Too many open files
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:100) ~[?:?]
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111) ~[?:?]
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116) ~[?:?]
        at sun.nio.fs.UnixFileSystemProvider.newDirectoryStream(UnixFileSystemProvider.java:432) ~[?:?]
        at java.nio.file.Files.newDirectoryStream(Files.java:543) ~[?:?]
        at org.elasticsearch.gateway.MetaDataStateFormat.findMaxGenerationId(MetaDataStateFormat.java:353) ~[elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.gateway.MetaDataStateFormat.write(MetaDataStateFormat.java:222) ~[elasticsearch-7.6.2.jar:7.6.2]
[2021-05-03T15:49:46,486][WARN ][o.e.i.s.IndexShard       ] [elk2] [testlog-2019.12.10][0] failed to turn off translog retention
org.apache.lucene.store.AlreadyClosedException: engine is closed
        at org.elasticsearch.index.shard.IndexShard.getEngine(IndexShard.java:2528) ~[elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.index.shard.IndexShard.trimTranslog(IndexShard.java:1106) ~[elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.index.shard.IndexShard$3.doRun(IndexShard.java:1944) [elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:692) [elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.6.2.jar:7.6.2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:830) [?:?]
[2021-05-03T15:49:46,486][WARN ][o.e.i.s.IndexShard       ] [elk2] [testlog-2019.11.26][0] failed to turn off translog retention
org.apache.lucene.store.AlreadyClosedException: engine is closed
        at org.elasticsearch.index.shard.IndexShard.getEngine(IndexShard.java:2528) ~[elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.index.shard.IndexShard.trimTranslog(IndexShard.java:1106) ~[elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.index.shard.IndexShard$3.doRun(IndexShard.java:1944) [elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:692) [elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.6.2.jar:7.6.2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:830) [?:?]

I already ran curl -XPOST "localhost:9200/_cluster/reroute?retry_failed=true" in order to redistribute the unassigned shards.
But after that, node2 releases all its data and allocates shards from the beginning again, then stops at about the number shown in the picture above...

 curl -XGET localhost:9200/_cluster/allocation/explain?pretty
{
  "index" : "testlog-2020.05.27",
  "shard" : 0,
  "primary" : false,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "ALLOCATION_FAILED",
    "at" : "2021-05-03T11:17:24.408Z",
    "failed_allocation_attempts" : 5,
    "details" : "failed shard on node [2m_WCRVhQk6SAHo6zIG6cA]: failed recovery, failure RecoveryFailedException[[testlog-2020.05.27][0]: Recovery failed from {elk1}{veBpqUZrSIC9pX1SNrTvug}{whzfgE4zRBSTfdX5JTYJDg}{172.22.34.36}{172.22.34.36:9300}{dilm}{ml.machine_memory=8201326592, ml.max_open_jobs=20, xpack.installed=true} into {elk2}{2m_WCRVhQk6SAHo6zIG6cA}{5RfsPh5dT3yMGUyO6gCpMQ}{172.22.34.37}{172.22.34.37:9300}{dilm}{ml.machine_memory=8201322496, xpack.installed=true, ml.max_open_jobs=20}]; nested: RemoteTransportException[[elk1][172.22.34.36:9300][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[Phase[1] prepare target for translog failed]; nested: RemoteTransportException[[elk2][172.22.34.37:9300][internal:index/shard/recovery/prepare_translog]]; nested: EngineCreationFailureException[failed to create engine]; nested: FileSystemException[/var/lib/elasticsearch/nodes/0/indices/mEePgYxBSNKI9Jpi2yOSnA/0/index: Too many open files]; ",
    "last_allocation_status" : "no_attempt"
  },
  "can_allocate" : "no",
  "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions" : [
    {
      "node_id" : "2m_WCRVhQk6SAHo6zIG6cA",
      "node_name" : "elk2",
      "transport_address" : "172.22.34.37:9300",
      "node_attributes" : {
        "ml.machine_memory" : "8201322496",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true"
      },
      "node_decision" : "no",
      "deciders" : [
        {
          "decider" : "max_retry",
          "decision" : "NO",
          "explanation" : "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2021-05-03T11:17:24.408Z], failed_attempts[5], failed_nodes[[2m_WCRVhQk6SAHo6zIG6cA]], delayed=false, details[failed shard on node [2m_WCRVhQk6SAHo6zIG6cA]: failed recovery, failure RecoveryFailedException[[testlog-2020.05.27][0]: Recovery failed from {elk1}{veBpqUZrSIC9pX1SNrTvug}{whzfgE4zRBSTfdX5JTYJDg}{172.22.34.36}{172.22.34.36:9300}{dilm}{ml.machine_memory=8201326592, ml.max_open_jobs=20, xpack.installed=true} into {elk2}{2m_WCRVhQk6SAHo6zIG6cA}{5RfsPh5dT3yMGUyO6gCpMQ}{172.22.34.37}{172.22.34.37:9300}{dilm}{ml.machine_memory=8201322496, xpack.installed=true, ml.max_open_jobs=20}]; nested: RemoteTransportException[[elk1][172.22.34.36:9300][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[Phase[1] prepare target for translog failed]; nested: RemoteTransportException[[elk2][172.22.34.37:9300][internal:index/shard/recovery/prepare_translog]]; nested: EngineCreationFailureException[failed to create engine]; nested: FileSystemException[/var/lib/elasticsearch/nodes/0/indices/mEePgYxBSNKI9Jpi2yOSnA/0/index: Too many open files]; ], allocation_status[no_attempt]]]"
        },
        {
          "decider" : "throttling",
          "decision" : "THROTTLE",
          "explanation" : "reached the limit of incoming shard recoveries [2], cluster setting [cluster.routing.allocation.node_concurrent_incoming_recoveries=2] (can also be set via [cluster.routing.allocation.node_concurrent_recoveries])"
        }
      ]
    },
    {
      "node_id" : "veBpqUZrSIC9pX1SNrTvug",
      "node_name" : "elk1",
      "transport_address" : "172.22.34.36:9300",
      "node_attributes" : {
        "ml.machine_memory" : "8201326592",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true"
      },
      "node_decision" : "no",
      "deciders" : [
        {
          "decider" : "max_retry",
          "decision" : "NO",
          "explanation" : "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2021-05-03T11:17:24.408Z], failed_attempts[5], failed_nodes[[2m_WCRVhQk6SAHo6zIG6cA]], delayed=false, details[failed shard on node [2m_WCRVhQk6SAHo6zIG6cA]: failed recovery, failure RecoveryFailedException[[testlog-2020.05.27][0]: Recovery failed from {elk1}{veBpqUZrSIC9pX1SNrTvug}{whzfgE4zRBSTfdX5JTYJDg}{172.22.34.36}{172.22.34.36:9300}{dilm}{ml.machine_memory=8201326592, ml.max_open_jobs=20, xpack.installed=true} into {elk2}{2m_WCRVhQk6SAHo6zIG6cA}{5RfsPh5dT3yMGUyO6gCpMQ}{172.22.34.37}{172.22.34.37:9300}{dilm}{ml.machine_memory=8201322496, xpack.installed=true, ml.max_open_jobs=20}]; nested: RemoteTransportException[[elk1][172.22.34.36:9300][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[Phase[1] prepare target for translog failed]; nested: RemoteTransportException[[elk2][172.22.34.37:9300][internal:index/shard/recovery/prepare_translog]]; nested: EngineCreationFailureException[failed to create engine]; nested: FileSystemException[/var/lib/elasticsearch/nodes/0/indices/mEePgYxBSNKI9Jpi2yOSnA/0/index: Too many open files]; ], allocation_status[no_attempt]]]"
        },
        {
          "decider" : "same_shard",
          "decision" : "NO",
          "explanation" : "the shard cannot be allocated to the same node on which a copy of the shard already exists [[testlog-2020.05.27][0], node[veBpqUZrSIC9pX1SNrTvug], [P], s[STARTED], a[id=_RJkffb1Q9uHuq8Jwq3TVg]]"
        },
        {
          "decider" : "throttling",
          "decision" : "THROTTLE",
          "explanation" : "reached the limit of outgoing shard recoveries [2] on the node [veBpqUZrSIC9pX1SNrTvug] which holds the primary, cluster setting [cluster.routing.allocation.node_concurrent_outgoing_recoveries=2] (can also be set via [cluster.routing.allocation.node_concurrent_recoveries])"
        }
      ]
    }
  ]
}

Did you set the file descriptor limits? See File Descriptors | Elasticsearch Guide [7.12] | Elastic

I already set this in /etc/security/limits.conf:
elasticsearch - nofile 65535
But my unassigned shards still aren't fixed :worried:
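Note that on systemd-based systems, /etc/security/limits.conf only applies to login sessions, not to services started by systemd, which is a common reason a raised nofile limit never reaches Elasticsearch. The limit also has to be set in a systemd override (a sketch, per the Elasticsearch system-configuration docs):

```
# /etc/systemd/system/elasticsearch.service.d/override.conf
[Service]
LimitNOFILE=65535
```

After `systemctl daemon-reload && systemctl restart elasticsearch`, the process picks up the new limit.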

[2021-05-05T15:00:07,220][WARN ][o.e.c.r.a.AllocationService] [elk1] failing shard [failed shard, shard [.monitoring-kibana-7-2021.05.05][0], node[2m_WCRVhQk6SAHo6zIG6cA], [R], recovery_source[peer recovery], s[INITIALIZING], a[id=otqL0ujwSlisYT2lOBGGDQ], unassigned_info[[reason=ALLOCATION_FAILED], at[2021-05-05T10:30:06.517Z], failed_attempts[4], failed_nodes[[2m_WCRVhQk6SAHo6zIG6cA]], delayed=false, details[failed shard on node [2m_WCRVhQk6SAHo6zIG6cA]: shard failure, reason [refresh failed source[peer-recovery]], failure FileSystemException[/var/lib/elasticsearch/nodes/0/indices/Usy1AOLNRDiOK09K3jMvmA/0/index/_1_Lucene80_0.dvm: Too many open files]], allocation_status[no_attempt]], expected_shard_size[1047435], message [shard failure, reason [lucene commit failed]], failure [FileSystemException[/var/lib/elasticsearch/nodes/0/indices/Usy1AOLNRDiOK09K3jMvmA/0/index/_2_Lucene80_0.dvd: Too many open files]], markAsStale [true]]
java.nio.file.FileSystemException: /var/lib/elasticsearch/nodes/0/indices/Usy1AOLNRDiOK09K3jMvmA/0/index/_2_Lucene80_0.dvd: Too many open files
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:100) ~[?:?]
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111) ~[?:?]
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116) ~[?:?]
        at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:219) ~[?:?]
        at java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:478) ~[?:?]
        at java.nio.file.Files.newOutputStream(Files.java:223) ~[?:?]
        at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:410) ~[lucene-core-8.4.0.jar:8.4.0 bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:14]
        at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:406) ~[lucene-core-8.4.0.jar:8.4.0 bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:14]
        at org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:254) ~[lucene-core-8.4.0.jar:8.4.0 bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:14]
        at org.apache.lucene.store.FilterDirectory.createOutput(FilterDirectory.java:74) ~[lucene-core-8.4.0.jar:8.4.0 bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:14]
        at org.elasticsearch.index.store.ByteSizeCachingDirectory.createOutput(ByteSizeCachingDirectory.java:130) ~[elasticsearch-7.6.2.jar:7.6.2]
        at org.apache.lucene.store.FilterDirectory.createOutput(FilterDirectory.java:74) ~[lucene-core-8.4.0.jar:8.4.0 bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:14]
        at org.apache.lucene.store.LockValidatingDirectoryWrapper.createOutput(LockValidatingDirectoryWrapper.java:44) ~[lucene-core-8.4.0.jar:8.4.0 bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:14]
        at org.apache.lucene.store.TrackingDirectoryWrapper.createOutput(TrackingDirectoryWrapper.java:43) ~[lucene-core-8.4.0.jar:8.4.0 bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:14]
        at org.apache.lucene.codecs.lucene80.Lucene80DocValuesConsumer.<init>(Lucene80DocValuesConsumer.java:70) ~[lucene-core-8.4.0.jar:8.4.0 bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:14]
        at org.apache.lucene.codecs.lucene80.Lucene80DocValuesFormat.fieldsConsumer(Lucene80DocValuesFormat.java:141) ~[lucene-core-8.4.0.jar:8.4.0 bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:14]
        at org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsWriter.getInstance(PerFieldDocValuesFormat.java:224) ~[lucene-core-8.4.0.jar:8.4.0 bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:14]
        at org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsWriter.getInstance(PerFieldDocValuesFormat.java:160) ~[lucene-core-8.4.0.jar:8.4.0 bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:14]
        at org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsWriter.addSortedSetField(PerFieldDocValuesFormat.java:129) ~[lucene-core-8.4.0.jar:8.4.0 bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:14]
        at org.apache.lucene.index.SortedSetDocValuesWriter.flush(SortedSetDocValuesWriter.java:221) ~[lucene-core-8.4.0.jar:8.4.0 bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:14]
        at org.apache.lucene.index.DefaultIndexingChain.writeDocValues(DefaultIndexingChain.java:263) ~[lucene-core-8.4.0.jar:8.4.0 bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:14]
        at org.apache.lucene.index.DefaultIndexingChain.flush(DefaultIndexingChain.java:138) ~[lucene-core-8.4.0.jar:8.4.0 bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:14]
        at org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:468) ~[lucene-core-8.4.0.jar:8.4.0 bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:14]
        at org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:555) ~[lucene-core-8.4.0.jar:8.4.0 bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:14]
        at org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:722) ~[lucene-core-8.4.0.jar:8.4.0 bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:14]
        at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:3200) ~[lucene-core-8.4.0.jar:8.4.0 bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:14]
        at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3445) ~[lucene-core-8.4.0.jar:8.4.0 bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:14]
        at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3410) ~[lucene-core-8.4.0.jar:8.4.0 bc02ab906445fcf4e297f4ef00ab4a54fdd72ca2 - jpountz - 2019-12-19 20:16:14]
        at org.elasticsearch.index.engine.InternalEngine.commitIndexWriter(InternalEngine.java:2456) ~[elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.index.engine.InternalEngine.recoverFromTranslogInternal(InternalEngine.java:493) ~[elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.index.engine.InternalEngine.recoverFromTranslog(InternalEngine.java:453) ~[elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.index.engine.InternalEngine.recoverFromTranslog(InternalEngine.java:131) ~[elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.index.shard.IndexShard.recoverLocallyUpToGlobalCheckpoint(IndexShard.java:1445) ~[elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.indices.recovery.PeerRecoveryTargetService.doRecovery(PeerRecoveryTargetService.java:178) ~[elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.indices.recovery.PeerRecoveryTargetService.access$500(PeerRecoveryTargetService.java:79) ~[elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$RecoveryRunner.doRun(PeerRecoveryTargetService.java:563) ~[elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:692) ~[elasticsearch-7.6.2.jar:7.6.2]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.6.2.jar:7.6.2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:830) [?:?]
[2021-05-05T15:09:06,449][INFO ][o.e.c.m.MetaDataMappingService] [elk1] [testlog-2021.05.05/n6VhET9_SOWlRnVI73VKbQ] update_mapping [_doc]

It seems to say "Too many open files" again... What should I do?

What does the output of GET _nodes/stats/process?filter_path=**.max_file_descriptors show?
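For reference, that request as a curl command (assuming the default localhost:9200 endpoint):

```
curl -s "localhost:9200/_nodes/stats/process?filter_path=**.max_file_descriptors&pretty"
```

If it still reports the old value (often 4096), the raised limit has not reached the Elasticsearch process and the node needs to be restarted under the new limit.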