Cluster ID : 45cd54
Shard location-541275-1602
The KOPF plugin shows that the shard has been in an initializing state for more than 10 minutes now. Please restart it or do your magic. I have stopped all indexing activity and all known connections to it, but it still did not auto-recover. I think it is also holding up the snapshot job with this error:
`[2016-02-22 00:48:34,686][INFO ][rest.suppressed ] /_status Params: {index=_status, recovery=true} java.lang.IllegalArgumentException: No feature for name [_status] at org.elasticsearch.action.admin.indices.get.GetIndexRequest$Feature.fromName(GetIndexRequest.java:82) at …`
The problem with this shard is likely due to this event in your logs:
`[2016-02-21 18:08:29,172][WARN ][index.engine ] [instance-0000000009] [location-541275-1602][0] failed engine [merge failed] org.apache.lucene.index.MergePolicy$MergeException: java.lang.OutOfMemoryError: Direct buffer memory at org.e …`
An OutOfMemory event should normally trigger a node restart, and we're looking into why that did not happen here. I've restarted your node.
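As for the snapshot job's error: the IllegalArgumentException above is the cluster saying it does not recognize /_status as an endpoint (the recovery=true parameter suggests whatever is calling it is still using the old indices status API), so that call will keep failing regardless of the shard's state. If you want to watch the stuck shard's recovery yourself, the cat recovery API is the usual way; a minimal example, using the index name from your report:

GET /_cat/recovery/location-541275-1602?v

Each row shows the recovery stage plus the percentage of files and bytes recovered for one shard copy.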
To avoid these OOM events altogether, as any production cluster must, you'll need to reduce your memory requirements and/or upgrade the cluster. Production clusters should also use the high availability offerings, so there are replicas in multiple availability zones. The memory pressure indicator shows the current memory pressure. Your OOM seems to have been caused by a sudden increase, e.g. from overwhelming the cluster with concurrent index requests that queue up.
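If you want a raw number to watch alongside that indicator, the node stats API exposes heap usage directly; a minimal check (the field comes from the JVM stats section, but do verify the layout on your version):

GET /_nodes/stats/jvm?pretty

Look at jvm.mem.heap_used_percent per node; values that stay high (roughly above 75%) and climb after bursts of indexing line up with the kind of memory pressure that ends in an OOM.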
Hmm. I just realized I have a question you may be able to help me understand. If the Logstash elasticsearch input plugin hits an issue (in my case, "Error: Unable to establish loopback connection"), will it clear the scroll?
What happens when the plugin restarts? Does it start the query over or continue where it left off?
Can restarting the plugin contribute to the additional memory utilization that eventually led to the OOM exception?
input {
  elasticsearch {
    [blah blah ]                # connection settings omitted
    docinfo => true             # keep index/type/id metadata from the source documents
    type => "tcpdump_final"
  }
}
filter {
  metrics {
    meter => "events"           # count events flowing through the pipeline
    add_tag => "metric"
  }
}
output {
  if [type] == "tcpdump_final" {
    elasticsearch {
    }
  }
  if "metric" in [tags] {
    stdout {
      codec => line {
        format => "rate: %{[events][rate_1m]} ::: count %{[events][count]}"
      }
    }
  }
}
Upon hitting an error with the input, the plugin restarts and the search query gets executed again. The important thing to note is that you are then assigned a new scroll_id. I arrived at this conclusion from the metrics filter printout showing a count greater than the original number of documents; the query has to have restarted for the count to exceed the total number of documents in the source index. The second observation, which is the important one, comes from querying:
GET /_nodes/stats/indices/search?pretty
The search context count keeps increasing. I believe this contributed to my Elasticsearch instance hitting the OOM exception. (I left it running overnight; I never expected that.)
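If those contexts really are leftover scrolls from the restarted input, clearing them by hand should release the memory. A minimal way to do that is the clear-scroll API (_all clears every open scroll on the cluster, so only use it when nothing else is scrolling):

DELETE /_search/scroll/_all

The open_contexts figure in the node stats above should drop back down afterwards.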
This behavior makes it tough for those with limited resources to perform reindexing using Logstash.
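One thing that may soften the impact is tightening the scroll settings on the input itself; a sketch, assuming the standard scroll and size options of the logstash-input-elasticsearch plugin (the hosts value is just a placeholder for your cluster endpoint):

input {
  elasticsearch {
    hosts => ["localhost:9200"]   # placeholder; use your actual cluster endpoint
    scroll => "1m"                # keep-alive per scroll request
    size => 500                   # documents per scroll page; smaller pages mean smaller per-request buffers
    docinfo => true
    type => "tcpdump_final"
  }
}

A short keep-alive does not stop the plugin from re-running the query after a crash, but it should let an orphaned search context expire on its own after that window rather than lingering.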