Field split

Hi Guys,

I am ingesting some IoT sensor data into Elasticsearch via Logstash. The payload of the message is a hex string which currently looks something like this:

129, 91, 0, 0, 28, 28, 29, 0, 4, 0, 0, 0, 11, 38, 0, 238

This is currently all in one field when it arrives in Elasticsearch. I would like to split this field on the commas into separate fields. Does anyone out there have experience of doing this?

Thanks in advance
Nick

You can parse this using the csv filter:

csv {
    columns => ["field1", "field2", "field3", ...]
}

You can use whatever meaningful field names you like.

Hi sjabiulla,

I have tried your suggestion of using the csv filter, but I get the error below:

Error parsing csv {:field=>"hex", :source=>[129, 75, 10, 0, 24, 24, 24, 0, 4, 0, 0, 0, 60, 54, 0, 214], :exception=>#<NoMethodError: private method `gets' called for [129, 75, 10, 0, 24, 24, 24, 0, 4, 0, 0, 0, 60, 54, 0, 214]:Array>}

Any ideas?

Can you please provide your config and a sample input?

My current config is:

filter {
  csv {
    source => "hex"
    columns => ["field1", "field2"]
  }
}

The input would be:

129, 75, 10, 0, 24, 24, 24, 0, 4, 0, 0, 0, 60, 54, 0, 214

Remove source from the csv filter.

filter {
    csv {
        columns => ["field1", "field2"]
    }
}

I have tested the config below using the sample input you provided, and all the fields were split properly by the csv filter.

input {
    stdin {
    }
}

filter {
    csv {
        columns => ["field1", "field2"]
    }
}

output {
    stdout {
        codec => rubydebug {}
    }
}

Output:

{
        "field1" => "129",
      "column12" => " 0",
          "host" => "01hw1344293",
      "@version" => "1",
       "column8" => " 0",
      "column16" => " 214",
       "column5" => " 24",
      "column11" => " 0",
      "column14" => " 54",
    "@timestamp" => 2019-07-10T13:06:16.496Z,
       "column4" => " 0",
      "column10" => " 0",
        "field2" => " 75",
       "column3" => " 10",
      "column13" => " 60",
      "column15" => " 0",
       "column9" => " 4",
       "message" => "129, 75, 10, 0, 24, 24, 24, 0, 4, 0, 0, 0, 60, 54, 0, 214\r",
       "column6" => " 24",
       "column7" => " 24"
}

I really need the CSV data to come from the hex field, as that is the field it arrives in Elasticsearch in. Is it not possible to do it that way?

That's the reason I asked you for the config and a sample input. If you provide me the wrong config and the wrong sample input, then of course my answers won't help you.

Here is the full pipeline:

input {
  http {
    host => "0.0.0.0"
    port => "8080"
  }
}

filter {
  json {
    source => "message"
    target => "json_log"
  }
  if [message] =~ /^\s*$/ {
    drop { }
  }
}

filter {
  ruby {
    init => "require 'base64'"
    code => "event.set( '[payload]', Base64.decode64(event.get('[json_log][data]')).unpack('C*'))"
  }
}

filter {
  mutate {
    copy => { "payload" => "hex" }
  }
}

filter {
  csv {
    source => "hex"
    columns => ["field1", "field2"]
  }
}
output

The hex values come in encrypted; that's why I have to decrypt them and end up with the hex values.
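For what it's worth, the ruby filter above leaves payload as a Ruby array of integers rather than a string, which seems to be why the csv filter complained about gets earlier. The same decode step again, annotated, with a made-up Base64 value standing in for the real (encrypted) data:

filter {
  ruby {
    init => "require 'base64'"
    # Made-up sample: "gUsKAA==" decodes to the bytes 0x81 0x4B 0x0A 0x00,
    # and unpack('C*') turns those bytes into the Ruby array [129, 75, 10, 0].
    # An array is not a string, so csv's source option cannot read it directly.
    code => "event.set( '[payload]', Base64.decode64(event.get('[json_log][data]')).unpack('C*'))"
  }
}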

You can convert the array of integers to a string that csv will parse using

    ruby { code => 'event.set("payload", event.get("payload").to_s[1..-2])' }

The [1..-2] is to strip off the [ and ] that to_s will add.
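If it helps, here is a rough sketch of how that might sit alongside the csv filter, assuming the array has already been copied into hex by the mutate filter in your pipeline above:

filter {
  ruby {
    # Assumes "hex" holds the integer array produced by unpack('C*').
    # to_s renders it as "[129, 75, ...]" and [1..-2] drops the brackets.
    code => 'event.set("hex", event.get("hex").to_s[1..-2])'
  }
  csv {
    # "hex" is now a plain comma-separated string, so source works here.
    source => "hex"
    columns => ["field1", "field2"]
  }
}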

Thank you badger, I shall try it this morning.

I added the ruby code to the pipeline and got the following error:
Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"############-2019.07.11", :_type=>"_doc", :routing=>nil}, #LogStash::Event:0x1f4d7ec7], :response=>{"index"=>{"_index"=>"##########-2019.07.11", "_type"=>"_doc", "_id"=>"qbIq4GsB_o88aajI8mVC", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse field [payload] of type [long] in document with id 'qbIq4GsB_o88aajI8mVC'", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"For input string: "129, 99, 15, 0, 21, 20, 21, 0, 4, 0, 0, 0, 12, 59, 0, 202""}}}}}

Not sure where to go from here.

I think that is telling you that payload is now a string, but Elasticsearch has a mapping that tells it to expect a long in that field.

Renaming the field with mutate might fix it.
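Something along these lines might do it; payload_csv is just a made-up name here:

filter {
  # "payload_csv" is only a placeholder name; the aim is to keep the string
  # version out of the existing "payload" field, which is mapped as long.
  mutate {
    rename => { "payload" => "payload_csv" }
  }
}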
