Not sure exactly what you mean. If your fields indeed are numbers, why would you need to turn them into null? And if they're strings, why doesn't the gsub option work? Perhaps you can give an example or two of an input message and the expected output.
So the issue comes when the fields are numbers, but the data has become corrupted, which happens. If I try to forward the mistyped data to Elastic, I get the dreaded Error 400, which pukes into my log files (and can quickly fill up the drive on my logstash shipper).
I need logstash to normalize the data before it goes to Elastic. I am using templates in Elastic, so it is expecting specific data types.
Say I have a field that is sent to logstash that is meant to be a number (e.g. {"my_int" => "32"}), but it ends up being bad data (e.g. {"my_int" => "[ ]"}). I can use grok/regex to find the data in the data stream, but there's nothing that lets me normalize data that may be bad (from a number standpoint).
I need something in logstash to ensure that I send {"my_int" => null} so that Elastic doesn't throw a fit.
Right now, I can use gsub for strings, but don't have anything for numbers.
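The closest thing I can picture is dropping down to a ruby filter and sanity-checking the field myself. Something like this rough sketch (untested; my_int is just my example field, and it assumes a Logstash version with the event.get/event.set Ruby API):

```
filter {
  ruby {
    code => "
      v = event.get('my_int')
      # If the field is present but not a plain integer, null it out
      # so the mistyped value does not make Elasticsearch reject the document.
      if !v.nil? && v.to_s !~ /^-?[0-9]+$/
        event.set('my_int', nil)
      end
    "
  }
}
```

It would be nice if there were a built-in mutate option for this, the way gsub covers strings.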
Why do you need to store the field at all for the document where it is corrupted? Couldn't you just use a remove_field parameter and drop the field altogether?
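For example, something roughly like this (the "[ ]" check is just a guess at what your corrupted values look like; adjust the condition to match your actual bad data):

```
filter {
  if [my_int] == "[ ]" {
    mutate {
      remove_field => [ "my_int" ]
    }
  }
}
```

With the field gone, Elasticsearch simply indexes the document without it, which is usually equivalent to sending null as far as the mapping is concerned.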
This is not working. I am still getting "-" values in the fields getting sent to Elastic, instead of null.
What happens if I do tighten the grok, and now the message coming in does not meet the exact match? Does it throw out the whole message? I just want to throw out single bad fields.
Just put something like that after the grok that extracts/creates that field.
For a grok that doesn't match, the message will be tagged with _grokparsefailure. You can process those messages later in the Logstash pipeline if you want.
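As a sketch, you could route those tagged events away from your normal Elasticsearch output; the file path, hosts, and index name below are just placeholders:

```
output {
  if "_grokparsefailure" in [tags] {
    # Keep events the grok couldn't parse out of Elasticsearch
    # and write them to a file for later inspection.
    file {
      path => "/var/log/logstash/grok-failures.log"
    }
  } else {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "logstash-%{+YYYY.MM.dd}"
    }
  }
}
```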