Netflow and type long (fwd_flow_delta_bytes)

I have an issue where the data passed from Logstash to Elasticsearch is too big for type long:

Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"elastiflow-2018.03.06", :_type=>"doc", :_routing=>nil}, #<LogStash::Event:0x7cb6f62d>], :response=>{"index"=>{"_index"=>"someindex-2018.03.06", "_type"=>"doc", "_id"=>"33c0-2EBigYnKGPuQIO2", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [netflow.fwd_flow_delta_bytes]", "caused_by"=>{"type"=>"json_parse_exception", "reason"=>"Numeric value (12398523043251355676) out of range of long (-9223372036854775808 - 9223372036854775807)\n at [Source: org.elasticsearch.common.bytes.BytesReference$MarkSupportingStreamInputWrapper@17a81cfe; line: 1, column: 147]"}}}}}

How can I handle the situation where the value is too big for "long"?

Use float. You end up with 12398523043251356000 in Elasticsearch instead of 12398523043251355676, but at least it is in there.
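
If it helps, here is a minimal sketch of what that looks like as an index template change, applied via the Kibana Dev Tools console. The template name, index pattern, and doc type below are assumptions based on the index names in the error above, and only the one relevant field is shown; ElastiFlow ships its own template, so fold the change into however that template is actually managed in your setup.

    PUT _template/elastiflow
    {
      "index_patterns": ["elastiflow-*"],
      "mappings": {
        "doc": {
          "properties": {
            "netflow": {
              "properties": {
                "fwd_flow_delta_bytes": { "type": "float" }
              }
            }
          }
        }
      }
    }

Keep in mind the new mapping only applies to indices created after the template is updated; existing daily indices keep their long mapping unless you reindex them. A double mapping also works and preserves a few more significant digits than float.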

There are a couple things here...

  • In Elasticsearch, integers and longs are signed values. However, in the networking and systems management world there are a lot of unsigned values (usually counters for things like network traffic or storage bytes written/read). So eventually EVERYONE using Elastic for metrics monitoring will hit this issue.

  • That said, you probably have a different issue. fwd_flow_delta_bytes is not a counter, it is a delta - i.e. the number of bytes observed for this flow since the last record was forwarded for this flow. The time between records is determined by the timeout settings for flow accounting on your device. So let's say your flow timeout settings were really long, as in every 24 hours, and that you actually had a flow that was alive that long. Even if the interface was a fully saturated 100Gb/s, you still wouldn't have enough traffic in 24 hours to produce a delta value that large. In fact, you would need at least the equivalent of 11,500 100Gb/s interfaces to get that value (rough math below). So clearly your data is bogus.
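
To make the back-of-the-envelope math explicit (assuming a fully saturated link for the whole 24 hours):

    100 Gb/s                          = 1.25 x 10^10 bytes/s
    24 hours                          = 86,400 s
    max bytes per interface per day   = 1.25 x 10^10 x 86,400 ≈ 1.08 x 10^15 bytes
    reported delta                    = 12,398,523,043,251,355,676 ≈ 1.24 x 10^19 bytes
    1.24 x 10^19 / 1.08 x 10^15       ≈ 11,500 interfaces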

So what could be causing your bad data?

  1. The device is sending a malformed flow, or the flows don't match the template the device is forwarding, so logstash-codec-netflow doesn't know how to properly decode the flow record and you get these weird values.

  2. There is some unusual data structure being sent by the device and even though the template and flow records from the device are valid, logstash-codec-netflow doesn't know how to handle it.

I recommend getting a packet capture of the flows and opening an issue on the repository for logstash-codec-netflow.
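
For the capture itself, something along these lines on the Logstash host is usually enough. This is just a sketch; the interface name and UDP port 2055 are examples, so substitute whatever your netflow input is actually listening on:

    # capture the raw flow export packets as the collector sees them
    sudo tcpdump -i eth0 -w netflow-sample.pcap udp port 2055

Let the capture run long enough to include at least one template packet along with the data records, since the codec (and anyone debugging it) needs the template to decode them.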

Thanks a lot for the feedback. I will follow up as suggested.
