CSV Ingest Column 1 = Time, Row 1 = Device

I have a CSV to import that has been formatted for use in Excel. As a result (and necessarily, for it to open in Excel), it has not only been chopped into several files (which I can handle), but it has also been formatted so that the first column is the timestamp and the first row is the device identifier. For example:
    title,device1,device2,device3,device4
    timestamp1,data.dev1.ts1,data.dev2.ts1,data.dev3.ts1,data.dev4.ts1
    timestamp2,data.dev1.ts2,data.dev2.ts2,data.dev3.ts2,data.dev4.ts2
    timestamp3,data.dev1.ts3,data.dev2.ts3,data.dev3.ts3,data.dev4.ts3

A couple of years ago I did a similar transpose using perl, which I suppose is an option here (though I'm by no means proficient with perl). I was wondering whether anyone else has come across this and/or has any clever strategies for this kind of data sorting?

What are you looking for as the output? Do you want a bunch of events like these?

{ "device": "device1", "@timestamp": "parsed value of timestamp1", "data": "data.dev1.ts1" }
{ "device": "device2", "@timestamp": "parsed value of timestamp1", "data": "data.dev2.ts1" }
{ "device": "device3", "@timestamp": "parsed value of timestamp1", "data": "data.dev3.ts1" }
{ "device": "device4", "@timestamp": "parsed value of timestamp1", "data": "data.dev4.ts1" }
{ "device": "device1", "@timestamp": "parsed value of timestamp2", "data": "data.dev1.ts2" }
...

Yes, that would be fantastic! I would probably do some additional tweaking, but if you know of a good strategy or can point me in the right direction, it would be a huge help. My previously used perl code is way off:

    perl -F, -lane '$s=shift @F;print "$s,$_" for @F'
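For comparison, a header-aware variant of that one-liner might look like the sketch below. It is untested against the real files and assumes one file at a time, with the header as the first line; the device,timestamp,value output layout is also just an assumption:

    perl -F, -lane '
        if ($. == 1) { shift @F; @dev = @F; next }  # row 1: remember device names
        $ts = shift @F;                             # column 1: timestamp
        print "$dev[$_],$ts,$F[$_]" for 0..$#F;     # one line per device/value pair
    ' input.csv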

I am not saying this is a good use case for logstash, but it can be done.

    ruby {
        code => '
            if ! @deviceList
                # First line is the header row: save the device names and
                # drop the leading "title" cell
                @deviceList = event.get("message").split(",")
                @deviceList.shift(1)
            else
                # Data row: the first cell is the timestamp, the rest are values
                data = event.get("message").split(",")
                timestamp = data.shift(1)[0]

                # Emit one new event per device/value pair
                data.each_index { |x|
                    newEvent = LogStash::Event.new
                    newEvent.set("timestamp", timestamp)
                    newEvent.set("device", @deviceList[x])
                    newEvent.set("data", data[x])
                    new_event_block.call(newEvent)
                }
            end
            event.cancel # Drop the original line from the file
        '
    }

You must set pipeline.workers to 1 for this to work, and make sure pipeline.ordered has the value you want: true (or auto, which works in 7.x but not in 8.x).
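As a sketch, those settings could go in logstash.yml (pipeline.workers can also be passed on the command line as -w 1):

    # logstash.yml -- one worker so the header row is processed before the
    # data rows, and ordered processing so rows stay in file order
    pipeline.workers: 1
    pipeline.ordered: true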

You can use a date filter to parse your [timestamp] field into [@timestamp].
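For example, a minimal sketch (ISO8601 is an assumption here; match the pattern to whatever your timestamps actually look like):

    date {
        match => [ "timestamp", "ISO8601" ]  # target defaults to [@timestamp]
    }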

Thank you for that. I haven't used ruby in quite some time, but that definitely seems feasible... though it makes sense that logstash may not be the best tool for this. Especially since it's a manual CSV ingest, I think I'll work with perl to shoehorn the data into a more logstash-friendly alignment.
