Elasticsearch failed: Search rejected due to missing shards [[.kibana_task_manager_7.17.7_001][0]]

Hello,

Current configuration:
Version: Elasticsearch / Kibana 7.17.3
2-node cluster

Recently I have been facing a lot of trouble keeping the cluster in a healthy state.
The error I am facing is:

Caused by: org.elasticsearch.action.search.SearchPhaseExecutionException: Search rejected due to missing shards [[.kibana_task_manager_7.17.7_001][0]]. Consider using `allow_partial_search_results` setting to bypass this error.
        at org.elasticsearch.action.search.AbstractSearchAsyncAction.run(AbstractSearchAsyncAction.java:227) ~[elasticsearch-7.17.9.jar:7.17.9]
        at org.elasticsearch.action.search.AbstractSearchAsyncAction.executePhase(AbstractSearchAsyncAction.java:454) [elasticsearch-7.17.9.jar:7.17.9]
        ... 267 more
This is my master node's conf file:
#
cluster.name: monitoring
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: ci-sh-mgmt-mon01
node.master: true
node.data: true
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
#path.data: /path/to/data
path.data: /mon/elasticsearch
#
# Path to log files:
#
#path.logs: /path/to/logs
path.logs: /mon/elasticsearch-logs
path.repo: /mon/snapshots
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# By default Elasticsearch is only accessible on localhost. Set a different
# address here to expose this node on the network:
#
network.host: 10.135.0.4
#
# By default Elasticsearch listens for HTTP traffic on the first free port it
# finds starting at 9200. Set a specific HTTP port here:
#
http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
#discovery.seed_hosts: ["0.0.0.0"]
#discovery.seed_hosts: ["10.135.0.4"]
discovery.seed_hosts: ["10.135.0.4"]
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
cluster.initial_master_nodes: ["ci-sh-mgmt-mon01"]


# For more information, consult the discovery and cluster formation module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true
#
# ---------------------------------- Security ----------------------------------
#
#                                 *** WARNING ***
#
# Elasticsearch security features are not enabled by default.
# These features are free, but require configuration changes to enable them.
# This means that users don’t have to provide credentials and can get full access
# to the cluster. Network connections are also not encrypted.
#
# To protect your data, we strongly encourage you to enable the Elasticsearch security features.
# Refer to the following documentation for instructions.
#
# https://www.elastic.co/guide/en/elasticsearch/reference/7.16/configuring-stack-security.html
#
xpack.security.enabled: true

xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.client_authentication: required
xpack.security.transport.ssl.keystore.path: elastic-stack-ca.p12
xpack.security.transport.ssl.truststore.path: elastic-stack-ca.p12


This is my data node's conf file:
cluster.name: monitoring
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: ci-sh-data

node.roles: [ data, ingest, remote_cluster_client]

#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /mon/elasticsearch-data1
#
# Path to log files:
#
path.logs: /mon/elasticsearch-data1/logs
path.repo: /mon/snapshots
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# By default Elasticsearch is only accessible on localhost. Set a different
# address here to expose this node on the network:
#
network.host: 10.135.0.4
#
# By default Elasticsearch listens for HTTP traffic on the first free port it
# finds starting at 9200. Set a specific HTTP port here:
#
http.port: 9400
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
discovery.seed_hosts: ["10.135.0.4"]
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
cluster.initial_master_nodes: ["ci-sh-data"]
#
# For more information, consult the discovery and cluster formation module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true

xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.client_authentication: required
xpack.security.transport.ssl.keystore.path: elastic-stack-ca.p12
xpack.security.transport.ssl.truststore.path: elastic-stack-ca.p12

This error has appeared for the first time; there have been no memory or disk-space issues since I set this up, and there have been no node restarts either.

I have a simple configuration, nothing complex, and I am using TLS certificates for internal node communication.

Hi @johnashish
Welcome to the community.

This can happen for various reasons, such as index corruption or data loss.

To address this issue, you can try the following steps:

  • Check the status of the affected index using the Elasticsearch cluster health API (example calls after this list). Ensure that all the shards for the index are in the "started" state.

  • Verify if the index exists and has data. If the index is not present or empty, you may need to reindex the data or restore it from a backup.

  • Check the Elasticsearch logs for any errors or warnings related to the index [.kibana_task_manager_7.17.7_001][0]. This might give you more insights into the root cause of the missing shards.
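
A rough sketch of the first check, using the index name taken from the error above (adjust as needed):

GET _cluster/health/.kibana_task_manager_7.17.7_001?level=shards

GET _cat/shards/.kibana_task_manager*?v&h=index,shard,prirep,state,unassigned.reason,node

The first call reports that index's health down to the shard level; the second lists each Kibana task manager shard with its state and, where unassigned, the reason.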

Hello @DineshNaik
Thanks for checking this out.

  • Only two indices are in the "unassigned" state; all the rest are in the "started" state.
{
  "cluster_name" : "monitoring",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 374,
  "active_shards" : 748,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 2,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 99.73333333333333
}
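
To see why those two shards are unassigned, the cluster allocation explain API is the usual next step; a minimal sketch for the shard named in the error, assuming the missing copy is the primary of shard 0:

GET _cluster/allocation/explain
{
  "index": ".kibana_task_manager_7.17.7_001",
  "shard": 0,
  "primary": true
}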

Here are the logs -

Caused by: org.elasticsearch.action.search.SearchPhaseExecutionException: Search rejected due to missing shards [[.kibana_task_manager_7.17.7_001][0]]. Consider using `allow_partial_search_results` setting to bypass this error.
        at org.elasticsearch.action.search.AbstractSearchAsyncAction.run(AbstractSearchAsyncAction.java:227) ~[elasticsearch-7.17.9.jar:7.17.9]
        at org.elasticsearch.action.search.AbstractSearchAsyncAction.executePhase(AbstractSearchAsyncAction.java:454) [elasticsearch-7.17.9.jar:7.17.9]
        ... 267 more
[2023-08-06T13:36:54,155][WARN ][r.suppressed             ] [ci-sh-mgmt-mon01] path: /.kibana_task_manager/_update_by_query, params: {ignore_unavailable=true, refresh=true, conflicts=proceed, index=.kibana_task_manager}
org.elasticsearch.action.search.SearchPhaseExecutionException:
        at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:713) [elasticsearch-7.17.9.jar:7.17.9]
        at org.elasticsearch.action.search.AbstractSearchAsyncAction.executePhase(AbstractSearchAsyncAction.java:459) [elasticsearch-7.17.9.jar:7.17.9]
        at org.elasticsearch.action.search.AbstractSearchAsyncAction.start(AbstractSearchAsyncAction.java:199) [elasticsearch-7.17.9.jar:7.17.9]
        at org.elasticsearch.action.search.TransportSearchAction.executeSearch(TransportSearchAction.java:1048) [elasticsearch-7.17.9.jar:7.17.9]
        at org.elasticsearch.action.search.TransportSearchAction.executeLocalSearch(TransportSearchAction.java:763) [elasticsearch-7.17.9.jar:7.17.9]
        at org.elasticsearch.action.search.TransportSearchAction.lambda$executeRequest$6(TransportSearchAction.java:399) [elasticsearch-7.17.9.jar:7.17.9]
        at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:136) [elasticsearch-7.17.9.jar:7.17.9]
        at org.elasticsearch.index.query.Rewriteable.rewriteAndFetch(Rewriteable.java:112) [elasticsearch-7.17.9.jar:7.17.9]
        at org.elasticsearch.index.query.Rewriteable.rewriteAndFetch(Rewriteable.java:77) [elasticsearch-7.17.9.jar:7.17.9]
        at org.elasticsearch.action.search.TransportSearchAction.executeRequest(TransportSearchAction.java:487) [elasticsearch-7.17.9.jar:7.17.9]
        at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:285) [elasticsearch-7.17.9.jar:7.17.9]
        at org.elasticsearch.action.search.TransportSearchAction.doExecute(TransportSearchAction.java:101) [elasticsearch-7.17.9.jar:7.17.9]
        at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:186) [elasticsearch-7.17.9.jar:7.17.9]
        at org.elasticsearch.action.support.ActionFilter$Simple.apply(ActionFilter.java:53) [elasticsearch-7.17.9.jar:7.17.9]
        at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:184) [elasticsearch-7.17.9.jar:7.17.9]
        at org.elasticsearch.xpack.security.action.filter.SecurityActionFilter.lambda$applyInternal$3(SecurityActionFilter.java:190) [x-pack-security-7.17.9.jar:7.17.9]
        at org.elasticsearch.action.ActionListen

Take a look at this

It'd be better to link directly to the section of the manual about troubleshooting missing shards since this contains the most complete and up-to-date information about this problem and what to do about it.

@johnashish Did you check the link shared by @DavidTurner?

Yes @DineshNaik, I did check it.
I had a backup of the indices that were not working, so I manually copied the backup indices into the data folder. (I have yet to configure snapshots and backups.)

But now I am getting this error in Kibana:

{"type":"log","@timestamp":"2023-08-09T17:38:04+05:30","tags":["error","plugins","security","session"],"pid":3490,"message":"Failed to schedule session index cleanup task: Saved object index alias [.kibana_task_manager_7.17.7] not found: index_not_found_exception: [index_not_found_exception] Reason: no such index [.kibana_task_manager_7.17.7] and [require_alias] request flag is [true] and [.kibana_task_manager_7.17.7] is not an alias"}
{"type":"log","@timestamp":"2023-08-09T17:38:14+05:30","tags":["error","plugins","security","session"],"pid":3490,"message":"Failed to schedule session index cleanup task: Saved object index alias [.kibana_task_manager_7.17.7] not found: index_not_found_exception: [index_not_found_exception] Reason: no such index [.kibana_task_manager_7.17.7] and [require_alias] request flag is [true] and [.kibana_task_manager_7.17.7] is not an alias"}

Is there any way to create this manually?
Currently my ES cluster is in a green state.
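
If the alias does have to be recreated by hand, a minimal sketch with the aliases API would be the following, assuming the backing index .kibana_task_manager_7.17.7_001 from the earlier error still exists (Kibana normally manages these system indices itself, so a Kibana restart or upgrade migration may recreate it for you):

POST _aliases
{
  "actions": [
    { "add": { "index": ".kibana_task_manager_7.17.7_001", "alias": ".kibana_task_manager_7.17.7" } }
  ]
}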

Quick update:
After restarting Kibana, functionality is working fine again.
I will monitor it for the next couple of days.
I have also enabled snapshot and restore.
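
For reference, registering a filesystem snapshot repository on the path.repo location from the configs above might look like the sketch below; the repository and snapshot names are placeholders:

PUT _snapshot/my_fs_backup
{
  "type": "fs",
  "settings": { "location": "/mon/snapshots" }
}

PUT _snapshot/my_fs_backup/snapshot_1?wait_for_completion=true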