Logstash: parse a CSV file with a column that contains multiple comma-separated data points?

I have a CSV file with multiple entries in the following format (the headers are not included in the actual data):

Time,      FT Data (2048 entries), SNR,   Frequency
430201865, 1000, 2000, 8500,...    4.50,  1266.255
280201865, 3500, 1400, 1750,...    12.75, 5548.127

It is a normal CSV file, except that the FT Data column itself contains 2048 comma-separated data points, which makes it difficult to parse.

Is there a way with Logstash to parse or split this so that the first column and the last 2 columns are parsed as normal, and each value in between becomes its own document? I also need to apply a conversion to each of these data points (for example, multiply the value by 2).

The expected output from the above in ES would be:

Time,      ft_data, snr,   frequency
430201865  2000     4.50   1266.255
430201865  4000     4.50   1266.255
430201865  17000    4.50   1266.255
280201865  7000     12.75  5548.127
280201865  2800     12.75  5548.127
280201865  15000    12.75  5548.127

Is this possible to achieve given my input?

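You could try a ruby filter to slice the columns out of the line manually, followed by a split filter to turn each ft_data element into its own event: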

    # Remove all spaces so the comma-split values carry no padding
    mutate { gsub => [ "message", " ", "" ] }
    ruby {
        code => '
            m = event.get("message")
            if m
                a = m.split(",")
                event.set("time", a.shift)
                newA = []
                ft = a.shift(2048)
                ft.each { |x|
                    newA << x.to_i * 2
                }
                event.set("ft_data", newA)
                event.set("snr", a.shift)
                event.set("frequency", a.shift)
            end
        '
    }
    # Emit one event per element of the ft_data array
    split { field => "ft_data" }
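
To check the slicing outside Logstash, the same logic can be sketched in plain Ruby. This is only an illustration: `parse_line` and `ft_count` are hypothetical names, and the sample line uses 3 FT values instead of 2048. Each returned hash stands in for one event as it would look after the split filter.

```ruby
# Hypothetical helper mirroring the filter above; not part of Logstash.
def parse_line(line, ft_count)
  # Same effect as the mutate/gsub step: drop all spaces, then split.
  a = line.delete(" ").split(",")
  time = a.shift
  # Take the FT block and apply the conversion (multiply by 2).
  ft = a.shift(ft_count).map { |x| x.to_i * 2 }
  snr = a.shift
  frequency = a.shift
  # One hash per FT value, standing in for one event after split.
  ft.map do |v|
    { "time" => time, "ft_data" => v, "snr" => snr, "frequency" => frequency }
  end
end

rows = parse_line("430201865, 1000, 2000, 8500, 4.50, 1266.255", 3)
rows.each { |r| puts r }
```

Note that time, snr and frequency stay strings here, just as they would in the filter unless they are converted separately.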

Thanks!

I'm trying this now but I'm getting an error at line event.set("ft_data", newA) saying

SyntaxError: (ruby filter code):12: syntax error, unexpected tIDENTIFIER.

I'm not familiar with Ruby. Is this the correct syntax for setting a field to an array?

I do not know what could be causing that. The code worked for me.

Yeah I can't figure it out either. Thanks anyways, I'll keep trying.

@Badger I got this working in the end! I'm not 100% sure why, but it started working once I swapped the quotes (single quotes to double quotes and vice versa).

Do you know if there is a way to do a log calculation within the ruby code? Where I multiply by 2, I actually need to do the following calculation:

(ln(x)*(1/ln(10)))*10

Can this be done?

Sure, Ruby has a natural log function (Math.log).
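
As a rough sketch of the calculation above in plain Ruby: `Math.log` is the natural log and `Math.log10` is the base-10 log, so (ln(x)*(1/ln(10)))*10 is the same as 10 * log10(x). The method names `to_db` and `to_db_short` are hypothetical, chosen only for this example.

```ruby
# The expression from the question, written with the natural log.
def to_db(x)
  (Math.log(x) * (1 / Math.log(10))) * 10
end

# Equivalent form using the base-10 log directly.
def to_db_short(x)
  10 * Math.log10(x)
end

puts to_db(1000)
puts to_db_short(1000)
```

Both forms agree up to floating-point rounding. One thing to watch: `Math.log(0)` returns -Infinity and a negative argument raises `Math::DomainError`, so zero or negative FT values may need handling before the conversion.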

Cheers, that worked perfectly!

One final question: I've found out I need to do some additional calculations to work out a new field called frequency. It is determined by taking the first value in the ft array and performing a calculation on it, and then incrementing that value by a fixed amount for each of the 2048 items in the array.

At the minute I've got it to a point where I now have two arrays, ft_data and frequencies. Logstash doesn't seem to let me split on both of these. Is there a way to get each element from these into their own event?

For example, given the previous example, the new output would look like:

Time,      ft_data, snr,   frequency, altered_frequency
430201865  2000     4.50   1266.255   1266.255
430201865  4000     4.50   1266.255   2266.255
430201865  17000    4.50   1266.255   3266.255
280201865  7000     12.75  5548.127   4266.255
280201865  2800     12.75  5548.127   5266.255
280201865  15000    12.75  5548.127   6266.255

Here altered_frequency starts at the original first value of the frequency field (1266.255) and 1000 is added cumulatively for each row. See the updated ruby code below, which works, but I'm then not able to split on the two fields power and frequencies:

    ruby {
        code => '
            m = event.get("message")
            if m
                a = m.split(",")
                event.set("time", a.shift)
                powerValues = []
                frequencies = []
                ft = a.shift(2048)
                # Starting frequency derived from the first FT value
                startFrequency = (ft.first.to_i * 1000) - (1024 * 24.414)
                ft.each { |x|
                    # 10 * log10(x), written with the natural log
                    powerValues << (Math.log(x.to_i) * (1/Math.log(10))) * 10
                    frequencies << startFrequency
                    startFrequency += 1000
                }
                event.set("power", powerValues)
                event.set("snr", a.shift)
                event.set("frequency", a.shift)
                event.set("frequencies", frequencies)
            end
        '
    }
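
The thread does not record an answer to this, but one possible approach is to combine the two arrays into a single array of pairs and split on that one field, since the split filter works on one field at a time. A minimal sketch in plain Ruby, under the assumption that one hash per data point is acceptable; `build_points` and the field name `points` are hypothetical, and the sample uses 3 values instead of 2048:

```ruby
# Hypothetical helper: pair each power value with its frequency.
def build_points(ft, start_frequency, step)
  ft.each_with_index.map do |x, i|
    {
      # Same conversion as in the filter: 10 * log10(x) via the natural log.
      "power" => (Math.log(x.to_i) * (1 / Math.log(10))) * 10,
      # Start value plus one step per preceding element.
      "frequency" => start_frequency + i * step
    }
  end
end

points = build_points(["1000", "2000", "8500"], 1266.255, 1000)
points.each { |pt| puts pt }
```

Inside the ruby filter this would become something like event.set("points", points), followed by split { field => "points" }. Each resulting event should then carry a single hash under points, which could be flattened into top-level fields with a mutate rename if needed. This has not been run against the data in this thread.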
