Dangling index and auto import to cluster state

Hi All,

I have a 3-node ELK cluster running version 5.5.1. There were some connectivity issues between the servers and they couldn't ping each other for a minute or so, which left some log data/shards out of sync between the nodes.

Then all three servers were rebooted, and when they came back up, one of the nodes kept logging the warning below:

[2017-12-19T16:39:55,783][WARN ][o.e.g.DanglingIndicesState] [sv-ocb3] [[filebeat-2017.12.07-000091/KFLrF95qTXSXvuZQ6WQLaQ]] can not be imported as a dangling index, as index with same name already exists in cluster metadata
[2017-12-19T16:39:56,463][WARN ][o.e.g.DanglingIndicesState] [sv-ocb3] [[filebeat-2017.12.07-000091/KFLrF95qTXSXvuZQ6WQLaQ]] can not be imported as a dangling index, as index with same name already exists in cluster metadata
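To see which UUID the live cluster metadata holds for that index name, and confirm it differs from the one in the warning, the cat indices API can show a UUID column. This is just a diagnostic sketch, assuming the default localhost:9200 endpoint:

```shell
# Compare the UUID registered in cluster state with the UUID in the
# dangling-index warning (KFLrF95qTXSXvuZQ6WQLaQ). A mismatch means the
# copy on disk is an older incarnation of the same index name.
curl -s 'localhost:9200/_cat/indices/filebeat-2017.12.07-000091?h=index,uuid,health&v'
```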

Two days later, I saw the dangling index being auto-imported, as shown below.

Logs from node sv-ocb3:

[2017-12-21T16:00:02,616][WARN ][o.e.g.DanglingIndicesState] [sv-ocb3] [[filebeat-2017.12.07-000091/KFLrF95qTXSXvuZQ6WQLaQ]] can not be imported as a dangling index, as index with same name already exists in cluster metadata
[2017-12-21T16:00:02,734][INFO ][o.e.g.DanglingIndicesState] [sv-ocb3] [[filebeat-2017.12.07-000091/KFLrF95qTXSXvuZQ6WQLaQ]] dangling index exists on local file system, but not in cluster metadata, auto import to cluster state

Logs from node sv-ocb2:

[2017-12-21T16:00:02,829][INFO ][o.e.c.m.MetaDataMappingService] [sv-ocb2] [filebeat-2017.12.07-000138/eh8vAvnTTseaCkiBoPV6ZA] update_mapping [log]
[2017-12-21T16:00:02,854][INFO ][o.e.g.LocalAllocateDangledIndices] [sv-ocb2] auto importing dangled indices [[filebeat-2017.12.07-000091/KFLrF95qTXSXvuZQ6WQLaQ]/OPEN] from [{sv-ocb3}{4RtFLtGzSFuchKWgtE7lPQ}{UOJrt9OoTSSqvT_cpWQJ9Q}{sv-ocb3.iwojima.com}{10.30.4.19:9400}]
[2017-12-21T16:00:03,342][INFO ][o.e.c.r.a.AllocationService] [sv-ocb2] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[filebeat-2017.12.07-000091][2]] ...]).
[2017-12-21T16:00:04,561][INFO ][o.e.c.r.a.AllocationService] [sv-ocb2] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[filebeat-2017.12.07-000091][3]] ...]).

After this auto import, Logstash kept throwing the error below for every request it received from Filebeat.

[2017-12-21T16:00:07,687][ERROR][logstash.outputs.elasticsearch] Got a bad response code from server, but this code is not considered retryable. Request will be dropped {:code=>400, :response_body=>"{\"error\":{\"root_cause\":[{\"type\":\"illegal_argument_exception\",\"reason\":\"Alias [filebeat_logs] has more than one indices associated with it [[filebeat-2017.12.07-000091, filebeat-2017.12.07-000138]], can't execute a single index op\"}],\"type\":\"illegal_argument_exception\",\"reason\":\"Alias [filebeat_logs] has more than one indices associated with it [[filebeat-2017.12.07-000091, filebeat-2017.12.07-000138]], can't execute a single index op\"},\"status\":400}"}

So the alias `filebeat_logs`, which is supposed to point only to the latest index, also started pointing to the dangling index filebeat-2017.12.07-000091 after the auto import.
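As an immediate workaround (not a fix for the auto import itself), the alias can be pointed back at a single index by removing the re-imported one from it via the aliases API. A sketch, with the index and alias names taken from the error above and the endpoint assumed to be localhost:9200:

```shell
# Detach the re-imported dangling index from the write alias so that
# single-document index operations against the alias resolve again.
curl -s -X POST 'localhost:9200/_aliases' -H 'Content-Type: application/json' -d '
{
  "actions": [
    { "remove": { "index": "filebeat-2017.12.07-000091", "alias": "filebeat_logs" } }
  ]
}'
```

After this, `filebeat_logs` should resolve only to filebeat-2017.12.07-000138 and Logstash's bulk requests should stop failing with the 400.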

Is there something I can do to avoid this issue the next time there are connectivity issues between the nodes?
