Parsing CSV file with multiple key values in one line


I'm new to the ELK stack and have been struggling with one problem. I'm trying to write a Logstash configuration to parse a log file in CSV format, as follows:

"time_stamp", 2020-02-28 16:35:16.048, "name", "Dan", "age", 22
"time_stamp", 2020-02-28 16:36:16.048, "name", "John", "age", 23
"time_stamp", 2020-02-28 16:37:16.048, "name", "Lu", "age", 24

The above csv log line has the following structure:

column name, value of type date time, column name, value of type String, column name, value of type integer.

Can someone show me how to use Logstash to process that log line so that when I send it into Elasticsearch and look into Kibana I can see

time_stamp: 2020-02-28 16:35:16.048, name: Dan, age: 22 ?

Note 1: I also want to keep the original data type (date time for "time_stamp" and integer for "age" instead of String) so that I can do some numerical visualizations in Kibana later.

Note 2: For the purposes of my program, I won't know the column names in advance. As a result, I cannot produce a plain CSV log file such as 2020-02-28 16:35:16.048, Dan, 22, because the Logstash configuration would have no way of knowing which field each value belongs to.

Note 3: I'm currently thinking of using a for loop, because the number of columns may change.

Thank you very much !!

You could do that in a ruby filter

    mutate { split => { "message" => "," } }
    ruby {
        code => '
            message = event.get("message")
            # Expect an even number of items: key, value, key, value, ...
            if message.is_a? Array and message.length % 2 == 0
                while message.length > 0 do
                    item = message.shift(2)
                    # Strip surrounding quotes and spaces
                    k = item[0].sub(/^[" ]*/, "").sub(/[" ]*$/, "")
                    v = item[1].sub(/^[" ]*/, "").sub(/[" ]*$/, "")
                    # Store integer-looking values as integers
                    if v.to_i.to_s == v
                        v = v.to_i
                    end
                    event.set(k, v)
                end
            end
        '
        remove_field => [ "message" ]
    }
    date { match => [ "time_stamp", "yyyy-MM-dd HH:mm:ss.SSS" ] }

I apologize if .sub(/^[" ]*/, "").sub(/[" ]*$/, "") makes your eyeballs bleed. I can't think of the right way to do it right now.
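One possibly tidier form, offered only as a sketch: a single gsub anchored at both ends of the string should do the same trimming. The helper name strip_quotes is hypothetical, and inside the ruby filter you would inline the gsub call rather than define a method.

```ruby
# Hypothetical helper: trim any run of double quotes and spaces
# from the start and the end of a string, leaving the middle alone.
def strip_quotes(s)
  s.gsub(/\A[" ]+|[" ]+\z/, "")
end

puts strip_quotes(' "name" ')
```

With this, the two lines in the filter would become k = item[0].gsub(/\A[" ]+|[" ]+\z/, "") and the same for v.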

Thank you very much !!

This is exactly what I need. I have one question, though: how do you handle cases where a value contains a comma (the delimiter)?

Then you would have to write a much more complicated parser than mutate+split.
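As a rough idea of what such a parser might look like, here is a minimal sketch in plain Ruby that splits only on commas outside double quotes. It assumes values containing a comma are wrapped in double quotes and that quotes are never escaped inside a value. The names split_outside_quotes and parse_pairs are made up for illustration; inside Logstash the same logic would go in the ruby filter's code, calling event.set(k, v) for each pair instead of building a hash.

```ruby
# Split a line on commas that are NOT inside double quotes.
# Assumption: quotes are balanced and never escaped within a value.
def split_outside_quotes(line)
  fields = []
  current = +""
  in_quotes = false
  line.each_char do |ch|
    if ch == '"'
      in_quotes = !in_quotes          # entering or leaving a quoted value
    elsif ch == "," && !in_quotes
      fields << current.strip         # delimiter outside quotes ends the field
      current = +""
    else
      current << ch
    end
  end
  fields << current.strip             # last field has no trailing comma
  fields
end

# Turn key, value, key, value, ... into a hash, converting
# integer-looking values to integers (same check as the filter above).
# Returns nil when the number of fields is odd.
def parse_pairs(line)
  fields = split_outside_quotes(line)
  return nil unless fields.length.even?
  fields.each_slice(2).map { |k, v| [k, v.to_i.to_s == v ? v.to_i : v] }.to_h
end

p parse_pairs('"time_stamp", 2020-02-28 16:35:16.048, "name", "Smith, Dan", "age", 22')
```

Note that this drops the quote characters and trims leading and trailing spaces from every field, so a value that needs to keep surrounding whitespace would require more work.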
