Hi,
I have a two-node ELK cluster running Elasticsearch 1.7.2.
A couple of days ago, Elasticsearch started logging the exception below continuously. At first the indexer could still send data to Elasticsearch, but after a few days it started failing with error 503.
Elasticsearch exception:
[ERROR][marvel.agent ] [clustername] Background thread had an uncaught exception:
org.elasticsearch.ElasticsearchException: failed to refresh store stats
at org.elasticsearch.index.store.Store$StoreStatsCache.refresh(Store.java:1573)
at org.elasticsearch.index.store.Store$StoreStatsCache.refresh(Store.java:1558)
at org.elasticsearch.common.util.SingleObjectCache.getOrRefresh(SingleObjectCache.java:55)
at org.elasticsearch.index.store.Store.stats(Store.java:290)
at org.elasticsearch.index.shard.IndexShard.storeStats(IndexShard.java:639)
at org.elasticsearch.action.admin.indices.stats.CommonStats.<init>(CommonStats.java:139)
at org.elasticsearch.action.admin.indices.stats.ShardStats.<init>(ShardStats.java:55)
at org.elasticsearch.indices.IndicesService.stats(IndicesService.java:231)
Logstash indexer error:
:message=>"retrying failed action with response code: 503
The server's hardware, CPU, and memory all look OK.
The problem goes away after I restart the Elasticsearch service.
What caused this problem, and how can I prevent it from happening again? Is there a way to monitor for this error, or can ES send out notifications?
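To make the monitoring part of the question concrete, here is a minimal sketch of the kind of external check I have in mind. It polls the `_cluster/health` endpoint and flags a 503 or a non-green status. The URL `http://localhost:9200`, the timeout, and the function names `classify_health` and `check_cluster` are assumptions for illustration only, and it does not detect the "failed to refresh store stats" exception itself, which would still need log monitoring.

```python
import json
import urllib.error
import urllib.request

# Assumption: a node is reachable on the default HTTP port of the local machine.
ES_URL = "http://localhost:9200"


def classify_health(status_code, body):
    """Return (ok, message) for a /_cluster/health HTTP response."""
    if status_code == 503:
        return False, "elasticsearch returned 503 (service unavailable)"
    if status_code != 200:
        return False, "unexpected HTTP status %d" % status_code
    # The health response is a JSON object with a "status" field
    # whose value is green, yellow, or red.
    status = json.loads(body).get("status")
    if status == "green":
        return True, "cluster status is green"
    return False, "cluster status is %s" % status


def check_cluster(base_url=ES_URL, timeout=10):
    """Query the cluster health endpoint once and classify the result."""
    try:
        with urllib.request.urlopen(base_url + "/_cluster/health", timeout=timeout) as resp:
            return classify_health(resp.getcode(), resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # Non-2xx responses (including 503) are raised as HTTPError.
        return classify_health(err.code, err.read().decode("utf-8", "replace"))
    except (urllib.error.URLError, OSError) as err:
        return False, "could not reach elasticsearch: %s" % err
```

Something like this could be run from cron, calling `check_cluster()` and sending a mail whenever the first element of the result is false. Treating anything other than green as a failure is a choice made here for simplicity; a yellow status may be acceptable in some setups.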