Grok's strange data conversions


(Cornoualis) #1

Hi,

I have really weird conversions when using grok. I really don't get why the final values are so different from the originals:

I use the DATA grok pattern because I sometimes have numbers with exponents (e.g. 12.12e-5) that are not recognized by BASE16FLOAT.

Is it a bug, or did I do something wrong?

Thanks in advance!


(Magnus Bäck) #2

It seems grok only uses single-precision floating-point numbers, whose precision is about seven significant digits. Maybe other parts of Logstash (the mutate filter or the ruby filter) use double-precision floats.
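The single-precision precision loss can be reproduced in plain Ruby (the runtime Logstash itself uses) by round-tripping a value through a 32-bit encoding; the value 12.12e-5 is the one from the original question. This is a sketch of the effect, not of grok's internals:

```ruby
# Ruby Floats are double precision; packing with 'f' forces the value
# through a 32-bit (single-precision) representation and back.
original = 12.12e-5
single   = [original].pack('f').unpack1('f')

puts original                      # the double-precision value
puts single                        # the value after a float32 round-trip
puts (single - original).abs       # small but nonzero difference
```

The round-trip changes the value by roughly one part in ten million, which matches the "about seven digits" rule of thumb for single precision.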

BTW, don't sprinkle your grok expressions with DATA patterns like that. It's a real performance killer. Use a more exact expression or even a csv filter.
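As a sketch of the csv alternative for semicolon-separated input (the column names here are placeholders, not from the original post):

```
filter {
  csv {
    separator => ";"
    # Placeholder column names; substitute your actual field names.
    columns => ["timestamp", "sensor", "value"]
  }
}
```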


(Cornoualis) #3

Thank you Magnus! I'll try these solutions.

About the DATA pattern, I didn't really have a choice since sometimes a value is so tiny that it is in the form "6.32e-5"...and that is not matched by grok with the BASE16FLOAT pattern...maybe a custom pattern would be better?

Or maybe dissect would be even better? According to the docs, if the line structure is always the same, dissect could be a good choice.

Is there a way to measure Logstash performance?


(Magnus Bäck) #4

maybe a custom pattern would be better?

For example (?<name-of-field>[^;]+), which extracts one or more characters up to the next semicolon.
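That inline custom pattern can be checked in plain Ruby, which uses the same named-capture syntax as grok; the tiny value and the semicolon delimiter are taken from the question, the surrounding fields are made up:

```ruby
# Named capture group, same (?<name>...) syntax grok accepts inline.
pattern = /(?<value>[^;]+)/
line    = "6.32e-5;other;fields"

m = pattern.match(line)
puts m[:value]   # => 6.32e-5
```

Because [^;]+ stops at the first semicolon, the exponent notation comes through untouched as a string; a later mutate convert (or a float field type) can then turn it into a number.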

Or maybe dissect would be even better? According to the docs, if the line structure is always the same, dissect could be a good choice.

Yes, or csv.
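A dissect sketch for a fixed, semicolon-separated layout (field names are placeholders):

```
filter {
  dissect {
    mapping => {
      "message" => "%{timestamp};%{sensor};%{value}"
    }
  }
}
```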

Is there a way to measure Logstash performance?

Have a look at the monitoring API.
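For example, assuming Logstash is running locally with the default API port (9600), the pipeline stats endpoint reports per-plugin event counts and timings:

```shell
# Per-pipeline stats, including duration_in_millis for each filter.
curl -XGET 'localhost:9600/_node/stats/pipelines?pretty'
```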


(Cornoualis) #5

Thanks a lot!! :+1:


(system) #6

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.