Logstash parse csv file with a column that contains multiple comma separated data points?

reillye · November 27, 2020, 12:28pm

I have a CSV file with multiple entries in the following format (the headers are not included in the actual data):

Time,      FT Data (2048 entries), SNR,   Frequency
430201865, 1000, 2000, 8500,...    4.50,  1266.255
280201865, 3500, 1400, 1750,...    12.75, 5548.127

It is a normal CSV file, except that the FT Data column contains 2048 comma separated data points in it, making it difficult to parse.

Is there a way with Logstash to parse or split this so that the first column and the last two columns are parsed as normal, while each value in between becomes its own document? I also need to apply a conversion to each of these data points (for example, multiply the value by 2).

The expected output from the above in ES would be:

Time,      ft_data, snr,   frequency
430201865  2000     4.50   1266.255
430201865  4000     4.50   1266.255
430201865  17000    4.50   1266.255
280201865  7000     12.75  5548.127
280201865  2800     12.75  5548.127
280201865  15000    12.75  5548.127

Is this possible to achieve given my input?

Badger · November 27, 2020, 2:15pm

You could try

    mutate { gsub => [ "message", " ", "" ] }
    ruby {
        code => '
            m = event.get("message")
            if m
                a = m.split(",")
                event.set("time", a.shift)
                newA = []
                ft = a.shift(2048)
                ft.each { |x|
                    newA << x.to_i * 2
                }
                event.set("ft_data", newA)
                event.set("snr", a.shift)
                event.set("frequency", a.shift)
            end
        '
    }
    split { field => "ft_data" }
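
To see what the filters above are doing, here is a minimal sketch of the same logic in plain Ruby, outside Logstash. The helper name parse_row and the ft_count parameter are made up for illustration (the thread uses a fixed 2048), and the final map only imitates what the split filter does by producing one record per ft_data element.

```ruby
# Hypothetical standalone version of the parsing logic from the ruby filter.
# ft_count is 2048 in the thread; the sample below uses 3 to keep the row short.
def parse_row(line, ft_count)
  a = line.delete(" ").split(",")      # same effect as the mutate/gsub step
  time = a.shift                        # first column
  ft = a.shift(ft_count).map { |x| x.to_i * 2 }  # FT data, each value doubled
  snr = a.shift                         # remaining two columns
  frequency = a.shift
  # Imitate the split filter: one record per ft_data element
  ft.map do |v|
    { "time" => time, "ft_data" => v, "snr" => snr, "frequency" => frequency }
  end
end

rows = parse_row("430201865, 1000, 2000, 8500, 4.50, 1266.255", 3)
rows.each { |r| puts r }
```

Note that time, snr and frequency stay strings in this sketch, just as they would in the filter unless they are converted separately.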

reillye · November 27, 2020, 2:43pm

Thanks!

I'm trying this now but I'm getting an error at line event.set("ft_data", newA) saying

SyntaxError: (ruby filter code):12: syntax error, unexpected tIDENTIFIER.

I'm not familiar with Ruby. Is this the correct syntax for setting a field to an array?

Badger · November 27, 2020, 2:47pm

I do not know what could be causing that. The code worked for me.

reillye · November 27, 2020, 3:17pm

Yeah I can't figure it out either. Thanks anyways, I'll keep trying.

reillye · November 30, 2020, 11:37am

@Badger I got this working in the end! I'm not 100% sure why, I just changed the single quotes to double quotes and vice-versa and it started working.

Do you know if there is a way to do a log calculation within the ruby code? Where I currently multiply by 2, I actually need to do the following calculation:

(ln(x)*(1/ln(10)))*10

Can this be done?

Badger · November 30, 2020, 3:43pm

Sure, Ruby has a natural log function (Math.log).
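
As a minimal sketch in plain Ruby (outside Logstash), the formula from the question can be written with Math.log. Since ln(x) / ln(10) is log10(x), Math.log10 should give the same result up to floating-point rounding. The helper names ten_log10 and ten_log10_direct are made up for illustration.

```ruby
# Hypothetical helper name; implements (ln(x) * (1 / ln(10))) * 10.
def ten_log10(x)
  (Math.log(x) * (1 / Math.log(10))) * 10
end

# Equivalent form using the base-10 logarithm directly.
def ten_log10_direct(x)
  10 * Math.log10(x)
end

# Compare both forms on the sample values from the question.
[1000, 2000, 8500].each do |x|
  puts "#{x}: #{ten_log10(x)} vs #{ten_log10_direct(x)}"
end
```

Math.log(0) returns -Infinity and a negative argument raises Math::DomainError, so inputs that x.to_i turns into 0 may need a guard.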

reillye · December 1, 2020, 2:33pm

Cheers, that worked perfectly!

One final question: I've found out I need to do some additional calculations to work out a new frequency field. It is determined by taking the first value in the ft array, performing a calculation on it, and then incrementing the result by a fixed amount for each of the 2048 items in the array.

At the minute I've got it to a point where I have two arrays, ft_data and frequencies. Logstash doesn't seem to let me split on both of these. Is there a way to get each pair of elements from these into its own event?

For example, given the previous example, the new output would look like

Time,      ft_data, snr,   frequency, altered_frequency
430201865  2000     4.50   1266.255   1266.255
430201865  4000     4.50   1266.255   2266.255
430201865  17000    4.50   1266.255   3266.255
280201865  7000     12.75  5548.127   4266.255
280201865  2800     12.75  5548.127   5266.255
280201865  15000    12.75  5548.127   6266.255

Here altered_frequency starts at the original first value of the frequency field (1266.255), and 1000 is cumulatively added to it for each element. See the updated ruby code below, which works, but I'm then not able to split on the two fields power and frequencies.

ruby {
        code => '
            m = event.get("message")
            if m
                a = m.split(",")
                event.set("time", a.shift)
                powerValues = []
                frequencies = []
                ft = a.shift(2048)
                startFrequency = (ft.first.to_i * 1000) - (1024 * 24.414)
                ft.each { |x|
                    # (ln(x) * (1 / ln(10))) * 10
                    powerValues << (Math.log(x.to_i) * (1 / Math.log(10))) * 10
                    frequencies << startFrequency
                    startFrequency += 1000
                }
                event.set("power", powerValues)
                event.set("snr", a.shift)
                event.set("frequency", a.shift)
                event.set("frequencies", frequencies)
            end
        '
    }
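
The thread does not record an answer to this. One possible approach, offered as an assumption rather than something confirmed here, is to avoid two parallel arrays altogether: build a single array in which each element is a hash holding both values, then split on that one field. The sketch below shows the pairing in plain Ruby; the helper name build_points, its parameters, and the field name "points" are all made up for illustration.

```ruby
# Hypothetical helper: pair each power value with its frequency so that a
# single array field can later be split into one event per element.
def build_points(ft_values, start_frequency, step)
  ft_values.each_with_index.map do |x, i|
    {
      "power" => 10 * Math.log10(x.to_i),                 # same as (ln(x) / ln(10)) * 10
      "altered_frequency" => start_frequency + i * step   # start value plus i increments
    }
  end
end

# Sample values taken from the first row in the question
points = build_points(["1000", "2000", "8500"], 1266.255, 1000)
points.each { |p| puts p }
```

Inside the ruby filter the result would be stored with event.set("points", points) and followed by split { field => "points" }. Each resulting event should then carry one hash under points, which can be flattened with a mutate rename if top-level fields are preferred.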

