Numeric value (18446744073709551615) out of range of long and Metricbeat

I've found some other topics that I think match this issue, but none of them are open for replies. It appears that a Metricbeat submission contained a value of -1, which wrapped around to 2^64-1 and caused the long value parsing error. I even think I have figured out that we can prevent this from happening in the future by using a processor with a drop_event.when condition.
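For reference, a sketch of what that processor might look like in metricbeat.yml. Note that the Beats condition key is `regexp` (not `regex`), and the field name and pattern here are assumptions based on the error shown later in this thread:

```yaml
processors:
  - drop_event:
      when:
        regexp:
          # Hypothetical: match the known-bad wrapped value so the event
          # is dropped before it is shipped to Elasticsearch.
          system.process.cgroup.memory.mem.limit.bytes: '^18446744073709551615$'
```

Dropping the whole event is the blunt option; a `drop_fields` processor with the same condition would keep the rest of the event and only remove the offending field.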

However, the problem I'm having right now is that there seem to be 17 entries waiting to be processed, and every time we start the service it crashes. We need to know how to find the bad entries, remove them, and then let the remaining values be processed to get things going again. I'm new to Elasticsearch and not exactly sure how to do this with the database.

Anyone got any ideas? We really need to get the Elasticsearch service up and running again.

Please show the logs that you are seeing.

What version are you on?

I was able to get the Elasticsearch service back up and running, but now our Kibana index appears to be empty. This entire issue started when we ran out of disk space. Is it possible that the Kibana index got corrupted from running out of disk space?

health status index                           uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   metricbeat-6.5.3-2019.01.13     LDAFBoDxToKZJNMDD9qn3w   1   1          0            0       261b           261b
red    open   filebeat-6.4.2-2018.10.13       KiNQWc-FTDKBwFZIVyOEqw   5   0       4371            0      1.3mb          1.3mb
yellow open   winlogbeat-6.3.2-2018.12.29     jqN7Rg_JRoSmPNs4JlTZmQ   3   1     105267            0     83.2mb         83.2mb
red    open   .kibana                         ut3gfPhrTnaP_ft6kWYZHQ   1   0
green  open   .monitoring-es-6-2019.01.09     KtBetV5cQ9KabqOFysxCjw   1   0    1868219         9826    965.2mb        965.2mb
yellow open   metricbeat-6.5.3-2019.01.04     4kpBYlxhTgyyDw-eLvzTOQ   1   1     376917            0    109.4mb        109.4mb

This is what is on the filesystem:

16:42:11 elk-02:/$ find /var/lib/elasticsearch/nodes/0/indices/ut3gfPhrTnaP_ft6kWYZHQ

I'm guessing that we are going to need to have backups for that index on the filesystem to recover all our Kibana configuration, saved searches, and dashboards?

If it has no replicas then, unfortunately, that's a yes.

So it looks like this problem hasn't gone away, and we are still occasionally getting these Out of Range errors. It definitely appears to be related to Metricbeat, as the errors come from the metricbeat-* indices and reference the "system.process.cgroup.memory.mem.limit.bytes" field.

We are sending Metricbeat data directly from the agent to Elasticsearch, so I'm not sure if we can deal with this data prior to it being sent. Or maybe we just have a bad template? We are looking at moving to Logstash as an intermediary. If we can use Logstash to look for the bad value and correct or exclude it, that would work as a solution.
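If Logstash does end up in the middle, a conditional filter could catch the bad value before it reaches Elasticsearch. A rough sketch, where the nested field path is an assumption based on the dotted field name in the error:

```
filter {
  # Drop events carrying the wrapped -1 value before they hit Elasticsearch.
  if [system][process][cgroup][memory][mem][limit][bytes] == 18446744073709551615 {
    drop { }
  }
}
```

A less drastic variant would use `mutate { remove_field => [...] }` under the same condition, so the rest of the event is still indexed and only the offending field is stripped.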

Here is the last debug message:

[2019-02-07T19:25:16,256][DEBUG][o.e.a.b.TransportShardBulkAction] [node-1] [metricbeat-6.2.3-2019.02.07][0] failed to execute bulk item (index) index {[metricbeat-6.2.3-2019.02.07][doc][9Acjy2gBOOYVFKhmUk1Y], source[n/a, actual length: [2.6kb], max length: 2kb]}
org.elasticsearch.index.mapper.MapperParsingException: failed to parse field [system.process.cgroup.memory.mem.limit.bytes] of type [long]
        at org.elasticsearch.index.mapper.FieldMapper.parse( 
        at java.util.concurrent.ThreadPoolExecutor$ [?:1.8.0_201]
        at [?:1.8.0_201]
Caused by: com.fasterxml.jackson.core.JsonParseException: Numeric value (18446744073709551615) out of range of long (-9223372036854775808 - 9223372036854775807)
 at [Source: org.elasticsearch.common.bytes.BytesReference$MarkSupportingStreamInputWrapper@82c0513; line: 1, column: 923]
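The value in that exception is exactly what a signed -1 looks like when its bit pattern is read back as an unsigned 64-bit integer, which supports the wraparound theory from earlier in the thread. A quick illustration (plain Python, just for the arithmetic):

```python
# 2^64 - 1 is the largest unsigned 64-bit value; a signed -1 has the
# same bit pattern when reinterpreted as unsigned.
UNSIGNED_64_MAX = 2**64 - 1
print(UNSIGNED_64_MAX)   # 18446744073709551615 -- the value in the exception

# -1 modulo 2^64 gives the same reinterpretation.
print(-1 % 2**64)        # 18446744073709551615

# Elasticsearch's long type is signed 64-bit, so its maximum is 2^63 - 1,
# which is the upper bound quoted in the JsonParseException message.
print(2**63 - 1)         # 9223372036854775807
```

So any producer that emits -1 as an unsigned counter will overflow Elasticsearch's long mapping by design, not by accident.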

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.