Hello,
I am trying to get a production ELK stack working again. It is stuck on 'indexing' in Kibana. I do not care about past log data at all; I just want it to work again for new data. Here are the details:
I recently had this ELK stack dropped in my lap to support; I did not set it up. It runs one Elasticsearch master (es-master) and two data nodes (es-data01, es-data02), and the pipeline also includes Kafka.
The stack was already in a broken state when I got it. One of the Elasticsearch data nodes, es-data02, had been close to running out of disk, and another person had expanded the volume it was on. After that, Elasticsearch was not able to start on that node. That is all the history I have.
After some basic troubleshooting, I found that the entire /var/lib/elasticsearch/nodes directory was missing on es-data02, which is what prevented Elasticsearch from starting. I created a new, empty nodes directory and the node was able to start. However, Kibana remains stuck in red and 'indexing'.
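For completeness, this is roughly what I did to get Elasticsearch starting again on that node (standard package paths; the ownership step is my assumption about what the service expects):

# recreate the missing data directory, give it back to the elasticsearch user, and start the service
sudo mkdir -p /var/lib/elasticsearch/nodes
sudo chown -R elasticsearch:elasticsearch /var/lib/elasticsearch/nodes
sudo systemctl start elasticsearch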
On Logstash, I found it throwing errors like 'unavailable_shards_exception' for two indexes: vpc-logs and eb-logs.
I found that all shards for all indexes are in an UNASSIGNED state. I attempted to force-allocate the shards for vpc-logs and eb-logs using the _cluster/reroute API with "allow_primary": true, but I received an "Unknown AllocationCommand [allocate]" response, and I found a thread suggesting this functionality may have been removed (https://github.com/elastic/elasticsearch/issues/18819).
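From that thread, it sounds like the old allocate command was replaced in 5.x by allocate_replica, allocate_stale_primary, and allocate_empty_primary. If I understand it correctly, the equivalent of what I was attempting would look something like the following (the index, shard number, and target node here are just an example, and accept_data_loss should be fine since I don't need the old data), but I have not tried it yet and I am not sure it is the right approach:

curl -XPOST '10.10.53.10:9200/_cluster/reroute' -H 'Content-Type: application/json' -d '
{
  "commands": [
    {
      "allocate_empty_primary": {
        "index": "vpc-logs",
        "shard": 0,
        "node": "es-data01",
        "accept_data_loss": true
      }
    }
  ]
}'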
I also looked at the docs for Shard Allocation Filtering as an alternative, but it is not clear to me how I could use it to solve this problem.
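For reference, the kind of setting the filtering docs describe looks like this (the node name is just a placeholder); since my shards are unassigned everywhere rather than sitting on the wrong node, I don't see how it would help:

curl -XPUT '10.10.53.10:9200/vpc-logs/_settings' -H 'Content-Type: application/json' -d '
{
  "index.routing.allocation.require._name": "es-data01"
}'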
Please help me out. I just want Kibana back in a working state with the same indexes it already has. I don't care at all about past log data, or which nodes end up holding any given shard.
I looked at _cat/shards, and every single shard of every index shows the same state and unassigned reason:
UNASSIGNED CLUSTER_RECOVERED
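(I pulled that with something like the following; I have only pasted the state and unassigned-reason columns above.)

curl -XGET '10.10.53.10:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason'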
root@es-master-1:/work# curl -XGET 10.10.53.10:9200/_cat/indices?v
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
red open cloudtrail-logs Lpuh176DRaSlepqZaMEkWA 5 1
red open eb-logs-1 NVGrsqPbQMSZHd38i1KWZA 5 1
red open kinesis-test1 el4fwJtvTPuG3EAAmglyEQ 1 1
red open .kibana ccysz1KaT2GMS1AJb-hzRw 1 1
red open s3-logs 8rRbxB7gSf6moU9xNtDTnw 2 1
red open elastalert_status wrPiv0IWThaen_xZZhCUmg 5 1
red open vpc-logs mfYgK5CxT3eoCl7X-UkZ4A 5 1
red open .watches t9BkX6n4TBmidPytJGVGYQ 1 1
red open elb-logs PfTxzluiSwuUUKxRhsfSHg 5 1
ubuntu@es-data-02:/$ curl -XGET 10.10.30.143:9200/_cluster/health?pretty
{
"cluster_name" : "jdp-production",
"status" : "red",
"timed_out" : false,
"number_of_nodes" : 3,
"number_of_data_nodes" : 2,
"active_primary_shards" : 0,
"active_shards" : 0,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 60,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 0.0
}
From Kibana (permanently stuck state):
| Component | Status |
|---|---|
| ui settings | Elasticsearch plugin is red |
| plugin:kibana@5.6.3 | Ready |
| plugin:elasticsearch@5.6.3 | Elasticsearch is still initializing the kibana index. |
| plugin:console@5.6.3 | Ready |
| plugin:metrics@5.6.3 | Ready |
| plugin:timelion@5.6.3 | Ready |
I should also have added this:
curl -XGET 10.10.53.10:9200
{
"name" : "es-master-01",
"cluster_name" : "companyname-production",
"cluster_uuid" : "9v9dm7wIQJe53O4GWWYqkg",
"version" : {
"number" : "5.6.3",
"build_hash" : "1a2f265",
"build_date" : "2017-10-06T20:33:39.012Z",
"build_snapshot" : false,
"lucene_version" : "6.6.1"
},
"tagline" : "You Know, for Search"
}