Keep variables in Logstash

I have an XML file that I am currently ingesting, and it is quite nested. To simplify it, I am splitting the fields as follows:
<(Header|parameter|differentparameter)
This is configured in the input section, with multiline codec configuration.
This works well, but now my output comes as separate JSON documents for Header, Parameter and DifferentParameter.
Is it possible to keep variables within Logstash, so that I can add a field or reference showing that a DifferentParameter is part of Parameter x, with Header y?
Basically I need to keep some variables and add the values of Header + Parameter to DifferentParameter as new fields.
I would like to have a loop that makes sure it catches the last Header + Parameter, as these values will change.

Any ideas? It would be great to know if someone was able to preserve a value from the pipeline and add it as a new field to the events that follow. Since this can happen a couple of times, we would need to be able to update this variable whenever a new value is present.

Please show

  • an example input file,
  • what you currently get from Logstash (use a stdout { codec => rubydebug } output), and
  • what you'd like to get from Logstash.
  1. My data looks like this (in an XML file):
    < Header Name="test" Description="this is a test value">
    < Parameter Name="im a component" Description="hello">
    < DifferentParameter="blue" type="saa" ShortDescription="im a colour" another="12022" >
    < size>i am a size< /size>
    < /DifferentParameter>
    < DifferentParameter="red" type="saaaaaa" ShortDescription="i too am a colour" another="120122">
    < size>i am a size too< /size>
    < /DifferentParameter>
    < /Parameter>
    < /Header>

I will get multiple Headers, which inside will have multiple Parameters, which inside will have multiple DifferentParameters.
In my Logstash input, I have multiline configured, so it looks like this:
codec => multiline {
pattern => "<(Header|Parameter|DifferentParameter)"
negate => true
what => "previous"
}

2) In the Logstash output I get something like this:
{
HeaderName=> values
HeaderDescription => values
}
{
ParameterName=> values
ParameterDescription => value
}
{
DifferentParameter=> "blue"
type=> "saa"
ShortDescription=> "im a colour"
another=> "12022"
size=> i am a size
}
{
DifferentParameter=> "red"
type=> "saaaaaa"
ShortDescription=> "i too am a colour"
another=> "120122"
size=> i am a size too
}
3) What I would like is for every DifferentParameter event to include the Header + Parameter it was nested in, so it would look like this:
{
HeaderName=> values
ParameterName=> values
DifferentParameter=> "blue"
type=> "saa"
ShortDescription=> "im a colour"
another=> "12022"
size=> i am a size
}

I want the Header + Parameter to be saved as a variable, and be added to each event of DifferentParameter. It would need to remember the current Header + Parameter values and add them as fields.

I want to add a field, such as below, but need to retrieve its value from the last event where Parameter was present.
mutate{
add_field => { Parameter => "%{[Parameter]}" }
}
mutate{
add_field => { DifferentParameter => "%{[DifferentParameter]}" }
}
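
The "remember the last Header and Parameter" idea can be pictured with the plain-Ruby sketch below. It is not Logstash configuration: events are modelled as hashes, and the class name ContextCarrier and its process method are hypothetical. Similar logic could probably live in a ruby filter that keeps state between events, but it only gives correct results if events reach it in file order, which is assumed here (for example, a single pipeline worker).

```ruby
# Hypothetical helper: remembers the most recent Header and Parameter
# and copies them onto every DifferentParameter event that follows.
class ContextCarrier
  def initialize
    @header_name = nil     # last HeaderName seen
    @parameter_name = nil  # last ParameterName seen
  end

  # Takes one event (a Hash) and returns it, enriched when it is a
  # DifferentParameter event. Assumes events arrive in file order.
  def process(event)
    if event.key?("HeaderName")
      @header_name = event["HeaderName"]
      @parameter_name = nil # a new Header starts a new scope
    elsif event.key?("ParameterName")
      @parameter_name = event["ParameterName"]
    elsif event.key?("DifferentParameter")
      event = event.merge(
        "HeaderName" => @header_name,
        "ParameterName" => @parameter_name
      )
    end
    event
  end
end

carrier = ContextCarrier.new
events = [
  { "HeaderName" => "test" },
  { "ParameterName" => "im a component" },
  { "DifferentParameter" => "blue", "type" => "saa" },
  { "DifferentParameter" => "red", "type" => "saaaaaa" }
]
enriched = events.map { |e| carrier.process(e) }
puts enriched.last.inspect
```

The stored values are simply overwritten whenever a newer Header or Parameter event arrives, which is the "update the variable until a new one is present" behaviour asked about.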

Change your multiline codec so you'll read the whole file in one swoop. Right now it appears you're reading it element by element.

If I change the multiline codec to read all the data in one go, won't I have the problem that my result ends up in one big JSON document, with all the HeaderName (multiple), ParameterName (multiple) and DifferentParameter (multiple) values in one result? Will I be able to tell that a DifferentParameter is part of a particular ParameterName and HeaderName?

Can you give an example of how to tweak the multiline codec?

If I change the multiline codec to read all the data in one go, won't I have the problem that my result ends up in one big JSON document, with all the HeaderName (multiple), ParameterName (multiple) and DifferentParameter (multiple) values in one result?

Not if you use filters to split up the big document as desired. You'll probably have to use a ruby filter followed by a split filter. Use the ruby filter to build a structure like this:

{
  ...
  "whatever": [
    {
      "HeaderName":  values,
      "ParameterName": values,
      "DifferentParameter": "blue"
    },
    {
      "HeaderName":  values,
      "ParameterName": values,
      "DifferentParameter": "red"
     }
  ]
}
...

Then ask the split filter to split on the whatever field.
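
The ruby-filter step described above could look roughly like the following plain-Ruby sketch. It assumes the whole document has already been parsed into nested hashes and arrays; the exact shape (headers containing "Parameter" lists containing "DifferentParameter" lists) and the function name flatten_parameters are assumptions made for illustration, not something Logstash is known to produce for this file. The resulting array is the kind of value that would be stored in the whatever field.

```ruby
# Hypothetical input shape: a list of headers, each with a list of
# parameters, each with a list of DifferentParameter entries.
def flatten_parameters(headers)
  headers.flat_map do |header|
    (header["Parameter"] || []).flat_map do |parameter|
      (parameter["DifferentParameter"] || []).map do |diff|
        # Copy the parent names onto each innermost entry.
        {
          "HeaderName" => header["Name"],
          "ParameterName" => parameter["Name"]
        }.merge(diff)
      end
    end
  end
end

doc = [
  {
    "Name" => "test",
    "Parameter" => [
      {
        "Name" => "im a component",
        "DifferentParameter" => [
          { "DifferentParameter" => "blue", "type" => "saa" },
          { "DifferentParameter" => "red", "type" => "saaaaaa" }
        ]
      }
    ]
  }
]

whatever = flatten_parameters(doc)
whatever.each { |entry| puts entry.inspect }
```

Each element of the array already carries its own HeaderName and ParameterName, so after splitting, every resulting event is self-contained.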

Can you give an example of how to tweak the multiline codec?

pattern => "expression that only matches the first line of each document"
negate => true
what => previous
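
To see what negate => true together with what => previous does, here is a small plain-Ruby simulation of the grouping rule (a sketch, not the codec's actual implementation): every line that does not match the pattern is appended to the previous event, so only matching lines start a new event. The function name group_lines and the pattern /^<Header / are hypothetical; the real expression depends on what the first line of each document looks like.

```ruby
# Simulates multiline grouping with negate => true, what => "previous":
# a line that does NOT match the pattern joins the previous event.
def group_lines(lines, pattern)
  events = []
  lines.each do |line|
    if events.empty? || line.match?(pattern)
      events << [line]      # matching line starts a new event
    else
      events.last << line   # non-matching line is appended
    end
  end
  events.map { |parts| parts.join("\n") }
end

lines = [
  '<Header Name="a">',
  '<Parameter Name="p1">',
  '</Parameter>',
  '</Header>',
  '<Header Name="b">',
  '</Header>'
]

# Hypothetical pattern: only the opening Header line matches.
per_header = group_lines(lines, /^<Header /)
puts per_header.length

# The pattern from the question also matches nested opening tags,
# so the same input is cut into more, smaller events.
per_element = group_lines(lines, /<(Header|Parameter|DifferentParameter)/)
puts per_element.length
```

A narrower pattern therefore yields fewer, larger events, each containing a complete block that later filters can take apart.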
