Courier Fetch: N of N shards failed in Kibana

Hi there.

I have been running ELK for about six months, and it has been stable.

But lately I have been getting this message very often:

Courier Fetch: 23 of 420 shards failed.

Could anybody explain this message?

I have no idea how to fix it.

My setup is: app logs streaming > Logstash > AWS Elasticsearch domain > Kibana UI.

What is the output of the cluster stats API?
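For reference, it can be retrieved with something along these lines (the endpoint below is just a placeholder for your AWS Elasticsearch domain; the human and pretty flags only affect formatting):

curl -s 'https://<your-domain-endpoint>/_cluster/stats?human&pretty'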

Here it is, @Christian_Dahlqvist:

{
"_nodes": {
"total": 1,
"successful": 1,
"failed": 0
},
"cluster_name": "891349355538:pgwdev",
"timestamp": 1521446981178,
"status": "yellow",
"indices": {
"count": 88,
"shards": {
"total": 432,
"primaries": 432,
"replication": 0,
"index": {
"shards": {
"min": 1,
"max": 5,
"avg": 4.909090909090909
},
"primaries": {
"min": 1,
"max": 5,
"avg": 4.909090909090909
},
"replication": {
"min": 0,
"max": 0,
"avg": 0
}
}
},
"docs": {
"count": 5390254,
"deleted": 1
},
"store": {
"size": "3.9gb",
"size_in_bytes": 4210638451,
"throttle_time": "0s",
"throttle_time_in_millis": 0
},
"fielddata": {
"memory_size": "845.4kb",
"memory_size_in_bytes": 865744,
"evictions": 0
},
"query_cache": {
"memory_size": "4.4mb",
"memory_size_in_bytes": 4686202,
"total_count": 1546397,
"hit_count": 889013,
"miss_count": 657384,
"cache_size": 36817,
"cache_count": 36821,
"evictions": 4
},
"completion": {
"size": "0b",
"size_in_bytes": 0
},
"segments": {
"count": 2292,
"memory": "31.3mb",
"memory_in_bytes": 32861595,
"terms_memory": "27.6mb",
"terms_memory_in_bytes": 29018539,
"stored_fields_memory": "1.6mb",
"stored_fields_memory_in_bytes": 1768592,
"term_vectors_memory": "936b",
"term_vectors_memory_in_bytes": 936,
"norms_memory": "29.3kb",
"norms_memory_in_bytes": 30016,
"points_memory": "65.5kb",
"points_memory_in_bytes": 67168,
"doc_values_memory": "1.8mb",
"doc_values_memory_in_bytes": 1976344,
"index_writer_memory": "0b",
"index_writer_memory_in_bytes": 0,
"version_map_memory": "0b",
"version_map_memory_in_bytes": 0,
"fixed_bit_set": "0b",
"fixed_bit_set_memory_in_bytes": 0,
"max_unsafe_auto_id_timestamp": 1520593756131,
"file_sizes": {}
}
},
"nodes": {
"count": {
"total": 1,
"data": 1,
"coordinating_only": 0,
"master": 1,
"ingest": 1
},
"versions": [
"5.5.2"
],
"os": {
"available_processors": 1,
"allocated_processors": 1,
"names": [
{
"count": 1
}
],
"mem": {
"total": "1.9gb",
"total_in_bytes": 2093498368,
"free": "141.2mb",
"free_in_bytes": 148090880,
"used": "1.8gb",
"used_in_bytes": 1945407488,
"free_percent": 7,
"used_percent": 93
}
},
"process": {
"cpu": {
"percent": 2
},
"open_file_descriptors": {
"min": 1412,
"max": 1412,
"avg": 1412
}
},
"jvm": {
"max_uptime": "9.8d",
"max_uptime_in_millis": 853696503,
"mem": {
"heap_used": "434.7mb",
"heap_used_in_bytes": 455873608,
"heap_max": "1015.6mb",
"heap_max_in_bytes": 1065025536
},
"threads": 113
},
"fs": {
"total": "11.6gb",
"total_in_bytes": 12548489216,
"free": "7.6gb",
"free_in_bytes": 8254406656,
"available": "7gb",
"available_in_bytes": 7593385984
},
"network_types": {
"transport_types": {
"netty4": 1
},
"http_types": {
"filter-jetty": 1
}
}
}
}

That is a lot of shards given the amount of heap you have available in your cluster. Read this blog post for some guidance on how large your shards should be and how many you should aim to have in your cluster.
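To see how those shards break down per index, something along these lines should work (the _cat indices API is standard; the endpoint is again a placeholder):

curl -s 'https://<your-domain-endpoint>/_cat/indices?v&h=index,pri,rep,docs.count,store.size&s=store.size:desc'

That should make it easy to spot daily indices that hold only a few MB of data each but still carry five primary shards.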

You can use the shrink index API to reduce the shard count by some margin. If you need to go further, you may need to use the reindex API to reindex your data into e.g. monthly indices instead.
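As a rough sketch of the reindex approach (the index names and pattern here are just placeholders; adjust them to match your own daily indices):

curl -s -X POST 'https://<your-domain-endpoint>/_reindex' -H 'Content-Type: application/json' -d '
{
  "source": { "index": "logstash-2018.03.*" },
  "dest": { "index": "logstash-2018.03" }
}'

Creating the monthly index up front with a single primary shard, via an index template or an explicit index creation, would also avoid ending up with five shards per month again.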

If I use the reindex API, will there be a duplication problem?
And will monthly indices still give the same performance as daily indices?

Given how little data you have in the cluster, I don't see monthly indices getting very large, so I would expect them to perform much better. As described in the blog post I linked to, we often recommend shard sizes in the tens of GB, which you seem unlikely to reach even with monthly indices.

While reindexing is going on, you could end up with data being duplicated, as the monthly index would potentially also match the index pattern. Once the reindexing has completed, however, I would expect the daily indices to be deleted. As you have relatively little data in your cluster, I would not expect reindexing to take very long.
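Deleting the daily indices once the reindex has finished could look roughly like this (placeholder names again, and assuming wildcard deletes are allowed on your domain):

curl -s -X DELETE 'https://<your-domain-endpoint>/logstash-2018.03.*'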

If this is not acceptable, you can reindex into an index that does not match the daily index pattern and then create an alias for the index at the time you delete the daily indices. This will reduce the amount of time duplicates can be seen in the system.
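A sketch of that swap using the aliases API, assuming the reindexed data lives in a name like monthly-2018.03 that your Kibana index pattern does not match (all names below are placeholders, with one remove_index entry per daily index):

curl -s -X POST 'https://<your-domain-endpoint>/_aliases' -H 'Content-Type: application/json' -d '
{
  "actions": [
    { "remove_index": { "index": "logstash-2018.03.01" } },
    { "remove_index": { "index": "logstash-2018.03.02" } },
    { "add": { "index": "monthly-2018.03", "alias": "logstash-2018.03" } }
  ]
}'

Because all the actions in one _aliases request are applied atomically, searches against the pattern switch from the daily indices to the alias without a gap.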

@Christian_Dahlqvist, thanks for your reply and advice.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.