Logstash filter: Split plugin doesn't seem to work

Hi everyone,

I tried to use the split plugin to split an array into multiple events.
However, it doesn't seem capable of interpreting a field reference like "%{type[1]}".

As input, I use this kind of message (an example):

{
    "host_sensor" => "10.0.2.1",
    "@timestamp" => "2016-08-22T12:17:50.463Z",
    "sensorsVegetablesTable" => [
        {
            "sensorVegetablesTemperature" => 21,
            "sensorVegetablesHumidity" => 40,
            "sensorVegetablesId" => "1111111"
        },
        {
            "sensorVegetablesTemperature" => 24,
            "sensorVegetablesHumidity" => 51,
            "sensorVegetablesId" => "2222222"
        }
    ],
    "type" => "array.sensorsVegetablesTable",
    "array_name" => "sensorsVegetablesTable"
}

Here is my Logstash configuration file:

input {
  ...
}

filter {
  mutate {
    split => { "type" => "." }
  }

  # Want to split into multiple events
  split {
    field => "%{type[1]}"
  }
}

output {
  stdout {
    codec => rubydebug
  }
}

As a result, I get this error message: "LogStash::ConfigurationError: Only String and Array types are splittable. field:%{type[1]} is of type = NilClass"

What can I do to resolve this issue? Does anyone know if the split plugin is working now? (It doesn't seem to be well supported on GitHub.)

Thank you

EDIT: the JSON message has been modified (it was a rubydebug view)

Your type field is a string, not an array, and doesn't contain any newline characters so there's nothing to split. What do you expect Logstash to do to your example event above?

I'd try %{[type][1]} instead of %{type[1]}.

Hello,

Thanks for the reply,

My "type" field becomes an array because of the mutate filter just before the split filter.
I tried your suggestion, but unfortunately I get the same kind of error:
LogStash::ConfigurationError: Only String and Array types are splittable. field:%{[type][1]} is of type = NilClass

I expect to get multiple events from this single event, each keeping all the fields except the one whose name is given by the array name, which should be split (one event per array entry).
I want to do this because Kibana doesn't support nested objects yet, so I had to change my design in Elasticsearch.

Thank you

Oh, I didn't notice the mutate filter that splits type into an array. But things still don't make sense to me. The expectation would be for [type][1] to expand to "sensorsVegetablesTable", right? Then what's the point of running the split filter on that? You're not expecting it to split your sensorsVegetablesTable field just because [type][1] contains "sensorsVegetablesTable"?

Let me explain my objective with this example :slight_smile::

{
    "host_sensor" => "10.0.2.4",
    "@timestamp" => "2016-08-22T13:35:55.932Z",
    "sensorsVegetablesTable" => [
        {
            "sensorVegetablesTemperature" => 21,
            "sensorVegetablesHumidity" => 40,
            "sensorVegetablesId" => "1111111"
        },
        {
            "sensorVegetablesTemperature" => 24,
            "sensorVegetablesHumidity" => 51,
            "sensorVegetablesId" => "2222222"
        }
    ],
    "origin" => "snmp_script-rabbitmq",
    "array_name" => "sensorsVegetablesTable"
}

This event will be split into these two events (after the filter):

{
    "host_sensor" => "10.0.2.4",
    "@timestamp" => "2016-08-22T13:35:55.932Z",
    "sensorsVegetablesTable_columns" => {
        "sensorVegetablesTemperature" => 21,
        "sensorVegetablesHumidity" => 40,
        "sensorVegetablesId" => "1111111"
    },
    "origin" => "snmp_script-rabbitmq",
    "array_name" => "sensorsVegetablesTable"
}

and

{
    "host_sensor" => "10.0.2.4",
    "@timestamp" => "2016-08-22T13:35:55.932Z",
    "sensorsVegetablesTable_columns" => {
        "sensorVegetablesTemperature" => 24,
        "sensorVegetablesHumidity" => 51,
        "sensorVegetablesId" => "2222222"
    },
    "origin" => "snmp_script-rabbitmq",
    "array_name" => "sensorsVegetablesTable"
}

"sensorsVegetablesTable" will be split, and each of its entries is stored in the "sensorsVegetablesTable_columns" field of one of the resulting events.
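The transformation above can be sketched in plain Ruby (a hypothetical illustration of the logic, e.g. for a ruby filter block; split_table and the sample event hash are my own names, not the split plugin's actual code):

```ruby
# Hypothetical sketch (not the split plugin's code): clone the event once
# per entry of the array named by "array_name", storing each entry under
# "<array_name>_columns" and dropping the original array field.
def split_table(event)
  array_name = event["array_name"]
  entries = event[array_name] || []
  base = event.reject { |key, _| key == array_name }
  entries.map do |entry|
    clone = base.dup
    clone["#{array_name}_columns"] = entry
    clone
  end
end

event = {
  "host_sensor" => "10.0.2.4",
  "array_name"  => "sensorsVegetablesTable",
  "sensorsVegetablesTable" => [
    { "sensorVegetablesId" => "1111111" },
    { "sensorVegetablesId" => "2222222" }
  ]
}

clones = split_table(event)
# Two clones, each keeping host_sensor and array_name,
# but holding only its own entry under "..._columns".
```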

So, to answer your question :slight_smile:: yes, I'm expecting it to split my array just because [type][1] contains the name of the array, the same way a %{...} field reference works when you add_field, like this:

mutate {
  add_field => {
    "[@metadata][type_es]" => "logs_Variables"
    "[@metadata][index_es]" => "%{type[0]}"
  }
  remove_field => [ "type", "@version" ]
}

According to the split filter documentation:

The split filter clones an event by splitting one of its fields and placing each value resulting from the split into a clone of the original event. The field being split can either be a string or an array.

An example use case of this filter is for taking output from the exec input plugin, which emits one event for the whole output of a command, and splitting that output by newline - making each line an event.

The end result of each split is a complete copy of the event with only the current split section of the given field changed.

NOTE: Oops, I'd like to add that this is not the actual JSON, because it's a rubydebug view!
The real JSON doesn't contain [0] and [1].

Sorry for all the back and forth here when the evidence was there from the beginning. The field option isn't parsed for variable references i.e. any %{whatever} in the string will be taken literally. That's why Logstash interprets your filter as a request to split the field named %{type[1]} while there's obviously no such field. Changing this behavior would be trivial and shouldn't have any side-effects. I've filed https://github.com/logstash-plugins/logstash-filter-split/issues/19 for you.

Thank you for your reply and for filing the issue.
I took a look at the code this morning before posting, and it seems the value is retrieved in the first lines of the filter method (via event[@field]). I found a little trick: calling the "sprintf" method of the "event" object would take the translation of %{var_name} into account.
Hope it helps.
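Roughly, the trick would look like this (a minimal sketch with a stand-in Event class of my own; the real LogStash::Event#sprintf also resolves nested references like %{[type][1]}, which this toy version does not):

```ruby
# Stand-in for LogStash::Event, just enough to illustrate the idea.
class Event
  def initialize(data)
    @data = data
  end

  def [](field)
    @data[field]
  end

  # Interpolates flat %{field} references, as LogStash::Event#sprintf does.
  def sprintf(template)
    template.gsub(/%\{([^}]+)\}/) { @data[Regexp.last_match(1)].to_s }
  end
end

event = Event.new(
  "array_name" => "sensorsVegetablesTable",
  "sensorsVegetablesTable" => [21, 24]
)

event["%{array_name}"]                # => nil: the reference is taken literally
key = event.sprintf("%{array_name}")  # => "sensorsVegetablesTable"
event[key]                            # => [21, 24], now splittable
```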

I found a little trick: calling the "sprintf" method of the "event" object would take the translation of %{var_name} into account.

Yes, that's exactly how to solve the problem. Feel free to send a pull request.

A pull request has been proposed for this issue:

Thank you.