Elasticsearch error all shards failed on single node

yc99 · February 17, 2023, 6:35am

Caused by: org.elasticsearch.action.NoShardAvailableActionException: [ip-13-35-23-200.ap-1.compute.internal][13.35.23.200:9300][indices:data/read/search[phase/query]]
[2023-02-17T08:14:15,934][WARN ][r.suppressed             ] [ip-13-35-23-200.ap-1.compute.internal] path: /.kibana_task_manager/_update_by_query, params: {ignore_unavailable=true, refresh=true, index=.kibana_task_manager}
org.elasticsearch.action.search.SearchPhaseExecutionException: all shards failed
	at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:728) ~[elasticsearch-8.6.1.jar:?]
	at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:418) ~[elasticsearch-8.6.1.jar:?]
	at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:760) ~[elasticsearch-8.6.1.jar:?]
	at org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:512) ~[elasticsearch-8.6.1.jar:?]
	at org.elasticsearch.action.search.AbstractSearchAsyncAction$1.onFailure(AbstractSearchAsyncAction.java:349) ~[elasticsearch-8.6.1.jar:?]
	at org.elasticsearch.action.ActionListener$Delegating.onFailure(ActionListener.java:92) ~[elasticsearch-8.6.1.jar:?]
	at org.elasticsearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:48) ~[elasticsearch-8.6.1.jar:?]
	at org.elasticsearch.action.search.SearchTransportService$ConnectionCountingHandler.handleException(SearchTransportService.java:642) ~[elasticsearch-8.6.1.jar:?]
	at org.elasticsearch.transport.TransportService$UnregisterChildTransportResponseHandler.handleException(TransportService.java:1646) ~[elasticsearch-8.6.1.jar:?]
	at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1372) ~[elasticsearch-8.6.1.jar:?]
	at org.elasticsearch.transport.TransportService$DirectResponseChannel.processException(TransportService.java:1508) ~[elasticsearch-8.6.1.jar:?]
	at org.elasticsearch.transport.TransportService$DirectResponseChannel.sendResponse(TransportService.java:1483) ~[elasticsearch-8.6.1.jar:?]
	at org.elasticsearch.transport.TaskTransportChannel.sendResponse(TaskTransportChannel.java:50) ~[elasticsearch-8.6.1.jar:?]
	at org.elasticsearch.action.support.ChannelActionListener.onFailure(ChannelActionListener.java:48) ~[elasticsearch-8.6.1.jar:?]
	at org.elasticsearch.action.ActionRunnable.onFailure(ActionRunnable.java:92) ~[elasticsearch-8.6.1.jar:?]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.onFailure(ThreadContext.java:900) ~[elasticsearch-8.6.1.jar:?]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:28) ~[elasticsearch-8.6.1.jar:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
	at java.lang.Thread.run(Thread.java:1589) ~[?:?]
Caused by: org.elasticsearch.action.NoShardAvailableActionException: [ip-13-35-23-200.ap-1.compute.internal][13.35.23.200:9300][indices:data/read/search[phase/query]]
[2023-02-17T08:14:15,946][WARN ][r.suppressed             ] [ip-13-35-23-200.ap-1.compute.internal] path: /.kibana_task_manager/_update_by_query, params: {ignore_unavailable=true, refresh=true, index=.kibana_task_manager}

org.elasticsearch.action.search.SearchPhaseExecutionException: all shards failed
	at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:728) ~[elasticsearch-8.6.1.jar:?]

I noticed that the Elasticsearch service went down unexpectedly. According to the log, it went down because all shards failed, but the problem can be resolved by simply restarting the service. I set replicas to 0 and 5 shards for every index, but from time to time Elasticsearch still goes down. After restarting the service, there is unassigned_shards/delayed_unassigned_shards, and all index health is green as well. Does anyone have an idea what happened here? Apart from the service going down unexpectedly sometimes, everything is fine.

warkolm · February 20, 2023, 2:28am

What is the output from the _cluster/stats?pretty&human API?

What do your full Elasticsearch logs show?

yc99 · February 20, 2023, 2:51am

{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "request [/_cluster/stats] contains unrecognized parameter: [human API] -> did you mean [human]?"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "request [/_cluster/stats] contains unrecognized parameter: [human API] -> did you mean [human]?"
  },
  "status" : 400
}

That is the output from the [_cluster/stats?pretty&human API] request.

The full log is basically the same: it keeps reporting "all shards failed". After restarting the Elasticsearch service, the log shows the following:

[2023-02-17T11:40:01,993][INFO ][o.e.c.r.a.AllocationService] [ip-13-35-23-200.ap-1.compute.internal] current.health="GREEN" message="Cluster health status changed from [RED] to [GREEN] (reason: [shards started [[.async-search][0]]])." previous.health="RED" reason="shards started [[.async-search][0]]"
[2023-02-17T12:36:05,925][INFO ][o.e.c.r.a.AllocationService] [ip-13-35-23-200.ap-1.compute.internal] current.health="GREEN" message="Cluster health status changed from [RED] to [GREEN] (reason: [shards started [[access-log2023.02.15][0]]])." previous.health="RED" reason="shards started [[access-log2023.02.15][0]]"
[2023-02-17T12:42:38,585][INFO ][o.e.c.r.a.AllocationService] [ip-13-35-23-200.ap-1.compute.internal] current.health="GREEN" message="Cluster health status changed from [RED] to [GREEN] (reason: [shards started [[.kibana-event-log-8.6.1-000001][0]]])." previous.health="RED" reason="shards started [[.kibana-event-log-8.6.1-000001][0]]"
[2023-02-17T14:19:44,107][INFO ][o.e.c.r.a.AllocationService] [ip-13-35-23-200.ap-1.compute.internal] current.health="GREEN" message="Cluster health status changed from [RED] to [GREEN] (reason: [shards started [[access-log2023.02.15][0], [.ds-.logs-deprecation.elasticsearch-default-2023.02.03-000001][0]]])." previous.health="RED" reason="shards started [[access-log2023.02.15][0], [.ds-.logs-deprecation.elasticsearch-default-2023.02.03-000001][0]]"
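
To line these recoveries up against the earlier "all shards failed" warnings, the RED to GREEN entries can be pulled apart with a short script. This is only a sketch in Python: the regular expression is written against the four lines above, and the names `HEALTH_CHANGE` and `parse_health_change` are made up for illustration, not part of any Elasticsearch tooling.

```python
import re

# Pattern fitted to the AllocationService lines pasted above (an assumption:
# other Elasticsearch versions or log layouts may format this differently).
HEALTH_CHANGE = re.compile(
    r'^\[(?P<timestamp>[^\]]+)\]\[INFO\s*\]\[o\.e\.c\.r\.a\.AllocationService\] '
    r'\[(?P<node>[^\]]+)\] current\.health="(?P<current>\w+)".*'
    r'previous\.health="(?P<previous>\w+)" reason="(?P<reason>.*)"$'
)

# Matches one "[index-name][shard-number]" pair inside the reason text.
SHARD = re.compile(r'\[([^\[\]]+)\]\[(\d+)\]')


def parse_health_change(line: str):
    """Return timestamp, health transition and started shards, or None."""
    match = HEALTH_CHANGE.match(line.strip())
    if match is None:
        return None
    shards = [(name, int(num)) for name, num in SHARD.findall(match["reason"])]
    return {
        "timestamp": match["timestamp"],
        "previous": match["previous"],
        "current": match["current"],
        "shards": shards,
    }


# First recovery line from the log above.
line = (
    '[2023-02-17T11:40:01,993][INFO ][o.e.c.r.a.AllocationService] '
    '[ip-13-35-23-200.ap-1.compute.internal] current.health="GREEN" '
    'message="Cluster health status changed from [RED] to [GREEN] '
    '(reason: [shards started [[.async-search][0]]])." '
    'previous.health="RED" reason="shards started [[.async-search][0]]"'
)
print(parse_health_change(line))
```

Running it over the whole log file would give a list of timestamps at which the cluster left RED, and which shard finished starting each time.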

warkolm · February 20, 2023, 2:51am

Can you share the full request that you made?

yc99 · February 20, 2023, 2:55am

Do you mean the full request, like the following?

http://13.35.23.200:9200/_cluster/stats?pretty&human%20API?

warkolm · February 20, 2023, 3:03am

The trailing "API" is not part of the API endpoint, so remove that and try again.
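
As an illustration, here is a minimal Python sketch that builds the request URL with only the `pretty` and `human` query parameters. The host and port are taken from the request above; the helper name `cluster_stats_url` is made up for this example, and the flags are written out as `pretty=true` and `human=true`, which is assumed to be equivalent to the bare form.

```python
from urllib.parse import urlencode, urlunsplit


def cluster_stats_url(host: str, port: int = 9200) -> str:
    """Build the _cluster/stats URL with only the pretty and human flags."""
    query = urlencode({"pretty": "true", "human": "true"})
    return urlunsplit(("http", f"{host}:{port}", "/_cluster/stats", query, ""))


print(cluster_stats_url("13.35.23.200"))
```

Nothing follows `human` in the resulting URL, so the "unrecognized parameter: [human API]" error should no longer be triggered.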

yc99 · February 20, 2023, 3:19am

Here is the output:

{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "go_UsMq7TaqQjTQaKklOzA",
  "timestamp" : 1676863027604,
  "status" : "green",
  "indices" : {
    "count" : 17,
    "shards" : {
      "total" : 49,
      "primaries" : 49,
      "replication" : 0.0,
      "index" : {
        "shards" : {
          "min" : 1,
          "max" : 5,
          "avg" : 2.8823529411764706
        },
        "primaries" : {
          "min" : 1,
          "max" : 5,
          "avg" : 2.8823529411764706
        },
        "replication" : {
          "min" : 0.0,
          "max" : 0.0,
          "avg" : 0.0
        }
      }
    },
    "docs" : {
      "count" : 20995,
      "deleted" : 83559
    },
    "store" : {
      "size" : "77mb",
      "size_in_bytes" : 80778547,
      "total_data_set_size" : "77mb",
      "total_data_set_size_in_bytes" : 80778547,
      "reserved" : "0b",
      "reserved_in_bytes" : 0
    },
    "fielddata" : {
      "memory_size" : "5.6kb",
      "memory_size_in_bytes" : 5736,
      "evictions" : 0
    },
    "query_cache" : {
      "memory_size" : "4.7mb",
      "memory_size_in_bytes" : 5015170,
      "total_count" : 462653,
      "hit_count" : 69793,
      "miss_count" : 392860,
      "cache_size" : 26,
      "cache_count" : 47,
      "evictions" : 21
    },
    "completion" : {
      "size" : "0b",
      "size_in_bytes" : 0
    },
    "segments" : {
      "count" : 302,
      "memory" : "0b",
      "memory_in_bytes" : 0,
      "terms_memory" : "0b",
      "terms_memory_in_bytes" : 0,
      "stored_fields_memory" : "0b",
      "stored_fields_memory_in_bytes" : 0,
      "term_vectors_memory" : "0b",
      "term_vectors_memory_in_bytes" : 0,
      "norms_memory" : "0b",
      "norms_memory_in_bytes" : 0,
      "points_memory" : "0b",
      "points_memory_in_bytes" : 0,
      "doc_values_memory" : "0b",
      "doc_values_memory_in_bytes" : 0,
      "index_writer_memory" : "8mb",
      "index_writer_memory_in_bytes" : 8417248,
      "version_map_memory" : "11.2kb",
      "version_map_memory_in_bytes" : 11507,
      "fixed_bit_set" : "14.1kb",
      "fixed_bit_set_memory_in_bytes" : 14472,
      "max_unsafe_auto_id_timestamp" : 1676828438225,
      "file_sizes" : { }
    },
    "mappings" : {
      "total_field_count" : 1090,
      "total_deduplicated_field_count" : 762,
      "total_deduplicated_mapping_size" : "14.6kb",
      "total_deduplicated_mapping_size_in_bytes" : 15020,
      "field_types" : [
        {
          "name" : "boolean",
          "count" : 14,
          "index_count" : 10,
          "script_count" : 0
        },
        {
          "name" : "constant_keyword",
          "count" : 3,
          "index_count" : 1,
          "script_count" : 0
        },
        {
          "name" : "date",
          "count" : 29,
          "index_count" : 11,
          "script_count" : 0
        },
        {
          "name" : "float",
          "count" : 12,
          "index_count" : 2,
          "script_count" : 0
        },
        {
          "name" : "half_float",
          "count" : 8,
          "index_count" : 2,
          "script_count" : 0
        },
        {
          "name" : "integer",
          "count" : 22,
          "index_count" : 1,
          "script_count" : 0
        },
        {
          "name" : "ip",
          "count" : 1,
          "index_count" : 1,
          "script_count" : 0
        },
        {
          "name" : "keyword",
          "count" : 338,
          "index_count" : 11,
          "script_count" : 0
        },
        {
          "name" : "long",
          "count" : 235,
          "index_count" : 10,
          "script_count" : 0
        },
        {
          "name" : "nested",
          "count" : 4,
          "index_count" : 2,
          "script_count" : 0
        },
        {
          "name" : "object",
          "count" : 208,
          "index_count" : 11,
          "script_count" : 0
        },
        {
          "name" : "text",
          "count" : 215,
          "index_count" : 10,
          "script_count" : 0
        },
        {
          "name" : "version",
          "count" : 1,
          "index_count" : 1,
          "script_count" : 0
        }
      ],
      "runtime_field_types" : [ ]
    },
    "analysis" : {
      "char_filter_types" : [ ],
      "tokenizer_types" : [ ],
      "filter_types" : [ ],
      "analyzer_types" : [ ],
      "built_in_char_filters" : [ ],
      "built_in_tokenizers" : [ ],
      "built_in_filters" : [ ],
      "built_in_analyzers" : [ ]
    },
    "versions" : [
      {
        "version" : "8.6.1",
        "index_count" : 17,
        "primary_shard_count" : 49,
        "total_primary_size" : "77mb",
        "total_primary_bytes" : 80778547
      }
    ],
    "search" : {
      "total" : 207077,
      "queries" : {
        "match_phrase" : 16,
        "bool" : 206691,
        "terms" : 33742,
        "prefix" : 1,
        "match" : 22540,
        "match_phrase_prefix" : 4,
        "match_all" : 1,
        "exists" : 30041,
        "range" : 165064,
        "term" : 200451,
        "nested" : 1,
        "simple_query_string" : 1160
      },
      "sections" : {
        "highlight" : 7,
        "stored_fields" : 14,
        "runtime_mappings" : 1,
        "query" : 206820,
        "script_fields" : 14,
        "terminate_after" : 6,
        "_source" : 105,
        "pit" : 211,
        "fields" : 14,
        "collapse" : 11120,
        "aggs" : 152768
      }
    }
  },
  "nodes" : {
    "count" : {
      "total" : 1,
      "coordinating_only" : 0,
      "data" : 1,
      "data_cold" : 1,
      "data_content" : 1,
      "data_frozen" : 1,
      "data_hot" : 1,
      "data_warm" : 1,
      "index" : 0,
      "ingest" : 1,
      "master" : 1,
      "ml" : 1,
      "remote_cluster_client" : 1,
      "search" : 0,
      "transform" : 1,
      "voting_only" : 0
    },
    "versions" : [
      "8.6.1"
    ],
    "os" : {
      "available_processors" : 2,
      "allocated_processors" : 2,
      "names" : [
        {
          "name" : "Linux",
          "count" : 1
        }
      ],
      "pretty_names" : [
        {
          "pretty_name" : "Amazon Linux 2",
          "count" : 1
        }
      ],
      "architectures" : [
        {
          "arch" : "amd64",
          "count" : 1
        }
      ],
      "mem" : {
        "total" : "3.6gb",
        "total_in_bytes" : 3891253248,
        "adjusted_total" : "3.6gb",
        "adjusted_total_in_bytes" : 3891253248,
        "free" : "108.7mb",
        "free_in_bytes" : 114020352,
        "used" : "3.5gb",
        "used_in_bytes" : 3777232896,
        "free_percent" : 3,
        "used_percent" : 97
      }
    },
    "process" : {
      "cpu" : {
        "percent" : 1
      },
      "open_file_descriptors" : {
        "min" : 856,
        "max" : 856,
        "avg" : 856
      }
    },
    "jvm" : {
      "max_uptime" : "9.6h",
      "max_uptime_in_millis" : 34613397,
      "versions" : [
        {
          "version" : "19.0.1",
          "vm_name" : "OpenJDK 64-Bit Server VM",
          "vm_version" : "19.0.1+10-21",
          "vm_vendor" : "Oracle Corporation",
          "bundled_jdk" : true,
          "using_bundled_jdk" : true,
          "count" : 1
        }
      ],
      "mem" : {
        "heap_used" : "649.3mb",
        "heap_used_in_bytes" : 680940640,
        "heap_max" : "1gb",
        "heap_max_in_bytes" : 1073741824
      },
      "threads" : 55
    },
    "fs" : {
      "total" : "99.9gb",
      "total_in_bytes" : 107361579008,
      "free" : "91.2gb",
      "free_in_bytes" : 97936973824,
      "available" : "91.2gb",
      "available_in_bytes" : 97936973824
    },
    "plugins" : [ ],
    "network_types" : {
      "transport_types" : {
        "netty4" : 1
      },
      "http_types" : {
        "netty4" : 1
      }
    },
    "discovery_types" : {
      "multi-node" : 1
    },
    "packaging_types" : [
      {
        "flavor" : "default",
        "type" : "rpm",
        "count" : 1
      }
    ],
    "ingest" : {
      "number_of_pipelines" : 0,
      "processor_stats" : { }
    },
    "indexing_pressure" : {
      "memory" : {
        "current" : {
          "combined_coordinating_and_primary" : "0b",
          "combined_coordinating_and_primary_in_bytes" : 0,
          "coordinating" : "0b",
          "coordinating_in_bytes" : 0,
          "primary" : "0b",
          "primary_in_bytes" : 0,
          "replica" : "0b",
          "replica_in_bytes" : 0,
          "all" : "0b",
          "all_in_bytes" : 0
        },
        "total" : {
          "combined_coordinating_and_primary" : "0b",
          "combined_coordinating_and_primary_in_bytes" : 0,
          "coordinating" : "0b",
          "coordinating_in_bytes" : 0,
          "primary" : "0b",
          "primary_in_bytes" : 0,
          "replica" : "0b",
          "replica_in_bytes" : 0,
          "all" : "0b",
          "all_in_bytes" : 0,
          "coordinating_rejections" : 0,
          "primary_rejections" : 0,
          "replica_rejections" : 0
        },
        "limit" : "0b",
        "limit_in_bytes" : 0
      }
    }
  }
}
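
Since the response is long, a small script can pull out the handful of fields that describe node count, shard layout and memory. The sketch below is in Python and is only illustrative: `summarize_cluster_stats` is a hypothetical helper, and `sample` is a hand-trimmed copy of the values from the output above rather than a live API call.

```python
import json


def summarize_cluster_stats(stats: dict) -> dict:
    """Pick a few headline numbers out of a _cluster/stats response."""
    jvm_mem = stats["nodes"]["jvm"]["mem"]
    os_mem = stats["nodes"]["os"]["mem"]
    shards = stats["indices"]["shards"]
    heap_pct = 100 * jvm_mem["heap_used_in_bytes"] / jvm_mem["heap_max_in_bytes"]
    return {
        "status": stats["status"],
        "nodes": stats["_nodes"]["total"],
        "indices": stats["indices"]["count"],
        "primary_shards": shards["primaries"],
        "heap_used_percent": round(heap_pct, 1),
        "os_mem_used_percent": os_mem["used_percent"],
    }


# Values copied from the output above, trimmed to the fields being read.
sample = json.loads("""
{
  "_nodes": {"total": 1},
  "status": "green",
  "indices": {"count": 17, "shards": {"total": 49, "primaries": 49}},
  "nodes": {
    "os": {"mem": {"used_percent": 97}},
    "jvm": {"mem": {"heap_used_in_bytes": 680940640,
                    "heap_max_in_bytes": 1073741824}}
  }
}
""")

print(summarize_cluster_stats(sample))
```

The same function can be pointed at a saved copy of the full response, because it only reads keys that appear in the output above.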

system · March 20, 2023, 3:19am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
