Split array with different nested elements

Hi,

I have a data stream coming in as array in the following format:

[
  {
    "dataType": 9,
    "payload": "[{\"isHeadingHome\":0,\"isTesing\":0}]",
    "data": "9"
  },
  {
    "dataType": 8,
    "payload": "[{\"FRONT_LEFT\":0.0,\"FRONT_RIGHT_current\":0.0}]",
    "data": "8"
  },
  {
    "dataType": 8,
    "payload": "[{\"FRONT_LEFT\":0.0,\"FRONT_RIGHT_current\":0.0}]",
    "data": "8"
  },
  {
    "dataType": 9,
    "payload": "[{\"isHeadingHome\":1,\"isTesing\":0}]",
    "data": "9"
  }
]

I need to split it into multiple events. I tried:

split {
  field => "data"
}

but it does not work and I see the error:

[2022-02-12T23:47:52,591][WARN ][logstash.filters.split ][main][] Only String and Array types are splittable. field:data is of type = NilClass

Any idea what am I missing?

If your data is an array then it must be the contents of some named field. That is the field name you need to supply to the split filter.

field:data is of type = NilClass indicates that a top-level field named [data] does not exist.

You may want something like

    split { field => "yourFieldNameGoesHere" }
    json { source => "[someField][payload]" target => "anotherField" }
    mutate { rename => { "[anotherField][0]" => "[anotherField]" } }
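For example, if the raw JSON array arrives on the [message] field, it can first be parsed into a named field and then split (a sketch; [message] and the field names here are assumptions, adjust them to match your pipeline):

    json { source => "message" target => "data" }
    split { field => "data" }
    json { source => "[data][payload]" target => "parsedPayload" }

After the split, each event holds one element of the array under [data], and the inner payload string is parsed into [parsedPayload].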

Thank you for the quick response.

I have manipulated the stream to be:

{
    "data": [
        {
            "dataType": 9,
            "payload": "[{\"isHeadingHome\":0,\"isTesing\":0}]",
            "data": "9"
        },
        {
            "dataType": 8,
            "payload": "[{\"FRONT_LEFT\":0.0,\"FRONT_RIGHT_current\":0.0}]",
            "data": "8"
        },
        {
            "dataType": 8,
            "payload": "[{\"FRONT_LEFT\":0.0,\"FRONT_RIGHT_current\":0.0}]",
            "data": "8"
        },
        {
            "dataType": 9,
            "payload": "[{\"isHeadingHome\":1,\"isTesing\":0}]",
            "data": "9"
        }
    ]
}

in the conf file:

split {
  field => "data"
}
json {
  source => "payload"
  target => "parsedPayload"
}

Now I am getting:
object mapping for [payload] tried to parse field [payload] as object, but found a concrete value

That error can occur when the output is not compatible with the target Elasticsearch index mappings. You can check whether the output is what you intended using the stdout output plugin.

output {
  stdout { codec => rubydebug }
}

If you share the actual output and the desired output, it will be easier to debug the pipeline. The error could also be caused by omitting the last mutate filter @Badger shared, which would leave the original (unparsed) "payload" field in the event.
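If the raw string is no longer needed once it has been parsed, one way to avoid the mapping conflict is to remove [payload] after the json filter has run (a sketch, assuming the raw string does not need to be indexed):

    json {
      source => "payload"
      target => "parsedPayload"
    }
    mutate {
      remove_field => [ "payload" ]
    }

That way the document sent to Elasticsearch contains only the parsed object, so no string value is written to a field mapped as object.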

This the full error:

[2022-02-13T21:20:31,234][WARN ][logstash.outputs.elasticsearch][main][ccad6cd3409188a6dd4fe75a73d50d6e92db14a4768c7a636041e969cbd8eb12] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"type9", :routing=>nil}, {"payload"=>"[{\"isHeadingHome\":0,\"isRelocalizing\":0,\"state\":3,\"stateStr\":\"Idle\",\"status\":1,\"statusStr\":\"Docked\"}]", "@version"=>"1", "environment"=>"production", "timestamp"=>1644787048349, "tags"=>["_split_type_failure"], "dataType"=>9, "deviceId"=>"114", "@timestamp"=>2022-02-13T21:17:28.349Z, "zoneId"=>31, "payloadJson"=>[{"statusStr"=>"Docked", "isHeadingHome"=>0, "status"=>1}]}], :response=>{"index"=>{"_index"=>"type9", "_type"=>"_doc", "_id"=>"klX19H4B6euHrZDSaS_E", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"object mapping for [payload] tried to parse field [payload] as object, but found a concrete value"}}}}

What I am trying to achieve is simply to split the array into single documents.

See this answer for a discussion of mapping exceptions and how to fix them.

The error is pointing at {"payload"=>"[{\"isHeadingHome\":0,. The [payload] field is a string, and Elasticsearch expects it to be an object. Note that the event has a _split_type_failure tag. It may be OK that payloadJson is an object -- that's not what is causing the error.

That Elasticsearch indexing error was caused by the "payload" field of the output being a string.

In addition, you have to deal with the _split_type_failure tag. It occurs when the split filter is used on a field that is neither a string nor an array.
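Events carrying that tag can be routed away from the index for inspection (a sketch; the index name "type9" is taken from your error log, the rest of this output block is illustrative):

    output {
      if "_split_type_failure" in [tags] {
        stdout { codec => rubydebug }
      } else {
        elasticsearch { index => "type9" }
      }
    }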

The output does not seem to be the result of the shared conf file. There must be additional filters that could be causing the error and the failure. I wonder if the _split_type_failure tag came from another split filter. If you share the whole pipeline, somebody can check.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.