Shards Failed | Most of the recent indexes are unassigned

I have a server with ELK and Heartbeat installed (all are v6.4.2).

Heartbeat on the current server is monitoring two other servers' Elasticsearch and Logstash (with its pipeline) in the same domain.
It creates an index like heartbeat-6.4.2-<date> every day. I created a dashboard for those two servers and it was working as expected.

It was working fine until 12th Oct.

Yesterday I tried to view the dashboard for the last 24 hrs and it's giving me the following error:

10 of 62 Shards failed

Then I tried to check the health of my Elasticsearch:

// 20191017122950
// http://<hostname>:9202/_cluster/health

{
  "cluster_name": "elasticsearch",
  "status": "yellow",
  "timed_out": false,
  "number_of_nodes": 1,
  "number_of_data_nodes": 1,
  "active_primary_shards": 195,
  "active_shards": 195,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 194,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 50.128534704370175
}

Unassigned Shards count is 194
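As a quick sanity check, the `active_shards_percent_as_number` in the health output is just the active shards divided by the total (active + unassigned) shards:

```shell
# 195 active shards out of 195 + 194 = 389 total -> ~50.13% active,
# matching the health output above.
awk 'BEGIN { printf "%.2f\n", 195 / (195 + 194) * 100 }'
# prints 50.13
```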

I didn't know the exact reason, so I started digging deeper.

curl -XGET http://hostname:9202/_cluster/allocation/explain?pretty

  "index" : "mdcp-logs-2019.09.28",
  "shard" : 2,
  "primary" : false,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "CLUSTER_RECOVERED",
    "at" : "2019-10-01T11:30:03.909Z",
    "last_allocation_status" : "no_attempt"
  "can_allocate" : "no",
  "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions" : [
      "node_id" : "Bewr4jriQziexcfUXZSfdg",
      "node_name" : "Bewr4jr",
      "transport_address" : "ip:9300",
      "node_attributes" : {
        "ml.machine_memory" : "67368890368",
        "xpack.installed" : "true",
        "ml.max_open_jobs" : "20",
        "ml.enabled" : "true"
      "node_decision" : "no",
      "deciders" : [
          "decider" : "same_shard",
          "decision" : "NO",
          "explanation" : "the shard cannot be allocated to the same node on which a copy of the shard already exists [[mdcp-logs-2019.09.28][2], node[Bewr4jriQziexcfUXZSfdg], [P], s[STARTED], a[id=SbtMhgHzTMCQv6yEGcwV8Q]]"

I can see the current data in the Discover tab, but it is not being stored in the index, so the dashboard is not working.

Any solution or workaround for this?

Hi @Sundaramoorthy_Anand

the problem you are experiencing is a result of having only a single node while those indices are configured with one replica per shard.
Since a replica and its primary have to be on separate nodes, your replicas cannot be allocated, and you're seeing a yellow state for those indices. You can fix the issue by configuring the yellow indices to use 0 replicas, which gives you green indices on a single-node cluster.

This should not, however, prevent indexing of new data into those indices. To find out what's preventing new data from being indexed, I would suggest looking into the logs of the ES and/or Logstash nodes that are trying to write their monitoring data to your node to see what errors they run into when indexing.
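If it helps, dropping the replica count is a single settings update. A minimal sketch (the hostname and port 9202 are placeholders taken from the health check earlier in the thread; `_all` targets every index, so narrow it to a pattern like `heartbeat-6.4.2-*` if you only want to touch specific indices):

```shell
# Set number_of_replicas to 0 for all indices on the single-node cluster.
# <hostname> is a placeholder; replace "_all" with an index pattern to
# limit the change to the yellow indices only.
curl -XPUT "http://<hostname>:9202/_all/_settings" \
     -H 'Content-Type: application/json' \
     -d '{ "index": { "number_of_replicas": 0 } }'
```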


Thanks !
I set the replicas to zero, and the number of unallocated shards is now zero. But I'm still seeing the "10 of 62 Shards failed" error.

@Sundaramoorthy_Anand no problem.

The shard failures are, I think, the result of whatever query Kibana sends to your ES nodes failing on some shards. Do your ES logs or the Kibana logs show any errors/warnings when you see that failure in Kibana?

Elasticsearch log:

 [2019-10-21T07:08:33,255][DEBUG][o.e.a.s.TransportSearchAction] [Bewr4jr] [heartbeat-6.4.2-2019.10.19][0], node[Bewr4jriQziexcfUXZSfdg], [P], s[STARTED], a[id=4OVH7mT5Qp-BCMHM-zRBUQ]: Failed to execute [SearchRequest{searchType=QUERY_THEN_FETCH, indices=[heartbeat-6.4.2*], indicesOptions=IndicesOptions[ignore_unavailable=true, allow_no_indices=true, expand_wildcards_open=true, expand_wildcards_closed=false, allow_aliases_to_multiple_indices=true, forbid_closed_indices=true, ignore_aliases=false], types=[], routing='null', preference='1571641708008', requestCache=null, scroll=null, maxConcurrentShardRequests=5, batchedReduceSize=512, preFilterShardSize=32, allowPartialSearchResults=true, source={"size":0,"query":{"bool":{"must":[{"range":{"@timestamp":{"from":1571509800000,"to":1572114599999,"include_lower":true,"include_upper":true,"format":"epoch_millis","boost":1.0}}},{"match_phrase":{"http.url.raw":{"query":"http://<url>:9202/","slop":0,"zero_terms_query":"NONE","boost":1.0}}}],"filter":[{"match_all":{"boost":1.0}},{"match_all":{"boost":1.0}}],"adjust_pure_negative":true,"boost":1.0}},"_source":{"includes":[],"excludes":[]},"stored_fields":"*","docvalue_fields":[{"field":"@timestamp","format":"date_time"}],"script_fields":{},"aggregations":{"2":{"terms":{"field":"monitor.status","size":5,"min_doc_count":1,"shard_min_doc_count":0,"show_term_doc_count_error":false,"order":[{"_count":"desc"},{"_key":"asc"}]}}}}}] lastShard [true]
        org.elasticsearch.transport.RemoteTransportException: [Bewr4jr][<ip>:9300][indices:data/read/search[phase/query]]
        Caused by: java.lang.IllegalArgumentException: Fielddata is disabled on text fields by default. Set fielddata=true on [monitor.status] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.
                at org.elasticsearch.index.mapper.TextFieldMapper$TextFieldType.fielddataBuilder ~[elasticsearch-6.4.2.jar:6.4.2]
                at org.elasticsearch.index.fielddata.IndexFieldDataService.getForField ~[elasticsearch-6.4.2.jar:6.4.2]
                at org.elasticsearch.index.query.QueryShardContext.getForField ~[elasticsearch-6.4.2.jar:6.4.2]
                ... (several frames elided; their class names were lost when pasting the log)
                at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun [elasticsearch-6.4.2.jar:6.4.2]
                at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun [elasticsearch-6.4.2.jar:6.4.2]
                at java.util.concurrent.ThreadPoolExecutor.runWorker [?:1.8.0_201]
                at java.util.concurrent.ThreadPoolExecutor$Worker.run [?:1.8.0_201]
                at java.lang.Thread.run [?:1.8.0_201]
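The `Caused by` line above points at the terms aggregation on `monitor.status`: in these indices it is apparently mapped as `text`, whereas the stock Heartbeat index template (as I understand it) maps it as `keyword`, which is what the Kibana visualization needs. A quick way to confirm what the mapping actually is (hostname/port are placeholders, as earlier in the thread):

```shell
# Check the mapping of monitor.status across the heartbeat indices.
# If it comes back as "text" instead of "keyword", the index template
# was not applied when the index was created.
curl -XGET "http://<hostname>:9202/heartbeat-6.4.2-*/_mapping/field/monitor.status?pretty"
```

If it is `text`, re-running `heartbeat setup --template` and letting a fresh daily index be created should restore the expected mapping; existing indices keep their old mapping and would need to be reindexed.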


Looks like something went wrong with your index templates/mappings there. Since you're saying it worked fine before Oct 12th, I'm assuming you followed all the steps in the docs to configure things correctly? Did anything change that may have broken your configured templates (e.g. starting to use LS but not configuring things accordingly)?

heartbeat setup -e \
      -E output.logstash.enabled=false \
      -E 'output.elasticsearch.hosts=["localhost:9200"]' \
      -E output.elasticsearch.username=heartbeat_internal \
      -E output.elasticsearch.password=YOUR_PASSWORD

Instead of doing this on the command line, I did the same through heartbeat.yml:

# Configure monitors
heartbeat.monitors:
- type: http

  # List of URLs to query
  urls: [  <url list>  ]

  # Configure task schedule
  schedule: '@every 10s'
  timeout: 16s

#==================== Elasticsearch template setting ==========================
setup.template.settings:
  index.number_of_shards: 1
  index.codec: best_compression

#============================== Kibana =====================================
setup.kibana:
  # Kibana Host
  host: "ip:5601"

#-------------------------- Elasticsearch output ------------------------------
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["ip:9202"]

The same file was working until Oct 12th, and the data still shows up in the Discover tab, but it is not getting assigned to any index.

@Armin_Braun Is there any update on this? Thanks in advance.

What does GET _cat/allocation?v return? Run this from Dev Tools in Kibana.
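Outside of Dev Tools, the same check can be run with curl (hostname/port are placeholders, as earlier in the thread); the output lists, per node, the shard count plus disk usage columns:

```shell
curl -XGET "http://<hostname>:9202/_cat/allocation?v"
```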

@sandeepkanabar This is what I get

After executing the PUT settings API (shown in the pic above), I got this:

Try restarting the ES process on the other node and see if it joins the ES cluster properly. If it joins, then you can set the replicas back to 1.

There's no such thing as pushing data to the Discover tab. Discover merely reads the data from the heartbeat index in this case.

There are NO master/slave nodes here.

Only one node is present in our architecture, so I set the replicas to ZERO as @Armin_Braun suggested.

@Sundaramoorthy_Anand sorry for the delay here. Unfortunately, I'm at my wits' end when it comes to this one. I would suggest asking for help in the Beats forums; at this point it seems like a Beats rather than an ES issue (the mappings simply don't fit the queries that are being set up, which isn't a problem with ES itself), I'm afraid.

okay @Armin_Braun

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.