How to add an optional JSON field in Logstash

Hi, I wrote a Logstash pipeline to modify some fields of JSON input files. The keys introduction.text, introduction.html and body are optional in the JSON, and I don't want to create empty fields in the JSON output files. How can I add checks for this in my pipeline?

input {
  file {
    mode => "read"
    path => "/documents/*" 
    start_position => "beginning"
    sincedb_path => "/dev/null"
    codec => multiline { pattern => "^Spalanzani" negate => true what => "previous" auto_flush_interval => 2 }
    file_chunk_size => 409600
  }
}  
filter {
  json {
    source => "message"
    id => "%{[message][id]}"
    add_field => {
      title => "%{[title][text]}"
      date => "%{[editorial_date]}"
      topic => "%{[topic][title]}"
      text => "%{[introduction][text]} %{[body]}"
      text_html => "%{[introduction][html]} %{[body]}"
    }
  }
...

Since I couldn't write a valid pipeline with an if condition, I tried gsub, but it didn't work: the literal %{[introduction][text]} still appears in the output JSON file.

filter {
  json {
    source => "message"
    id => "%{[message][id]}"
    add_field => {
      title => "%{[title][text]}"
      date => "%{[editorial_date]}"
      text => "%{[introduction][text]} %{[body]}"
      text_html => "%{[introduction][html]} %{[body]}"
    }
  }
  mutate {
    gsub => [
      "texte", "<.*?>", " ",
      "texte", "(%{[introduction][text]}|%{[body]})", " ",
      "texte_html", "(%{[introduction][html]}|%{[body]})", " ",
      "date", "T.*$", ""
    ]
  }
}

If I escape [ with \ in gsub:

"texte", "(%{\[introduction]\[text]}|%{\[body]})", " ",

I get this error:
Exception caught while applying mutate filter {:exception=>"Invalid FieldReference: `\\[introduction]\\[text]`"}

Here is a sample source JSON file:

{
  "id": "1111",
  "editorial_date": "2015-10-19T00:00:00Z",
  "title": {
    "text": "test title",
    "html": "<h1>test title</h1>"
  },
  "body": "<video data-els-src-id=\"\"..."
}

While waiting for a response, I found a way to add optional JSON fields. Does anyone have a better solution?

filter {
  json {
    source => "message"
    id => "%{[message][id]}"
    add_field => {
      title => "%{[title][text]}"
      date => "%{[editorial_date]}"
      topic => "%{[topic][title]}"
      intro_text => "%{[introduction][text]}"
      intro_html => "%{[introduction][html]}"
      body => "%{[body]}"
    }
  }
  if [intro_text] != "%{[introduction][text]}" {
    if [body] != "%{[body]}" {
      mutate {
        add_field => { "texte" => "%{[intro_text]} %{[body]}" }
      }
    } else {
      mutate {
        add_field => { "texte" => "%{[intro_text]}" }
      }
    }
  } else if [body] != "%{[body]}" {
    mutate {
      add_field => { "texte" => "%{[body]}" }
    }
  }
  if [intro_html] != "%{[introduction][html]}" {
    if [body] != "%{[body]}" {
      mutate {
        add_field => { "texte_html" => "%{[intro_html]} %{[body]}" }
      }
    } else {
      mutate {
        add_field => { "texte_html" => "%{[intro_html]}" }
      }
    }
  } else if [body] != "%{[body]}" {
    mutate {
      add_field => { "texte_html" => "%{[body]}" }
    }
  }
}
...

If the keys are optional and missing from the JSON input file, I would expect no fields to be created for them.

What does the output with empty fields look like? Could you share sample output using the rubydebug output plugin?

Anyway, a ruby filter could work well for you.
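
A minimal sketch of the ruby filter idea, assuming the field names from your pipeline (texte and texte_html are taken from your workaround). It has not been run against your data, so treat it as a starting point rather than a finished pipeline:

```conf
filter {
  json {
    source => "message"
  }
  ruby {
    code => '
      # event.get returns nil when the field is absent
      intro_text = event.get("[introduction][text]")
      intro_html = event.get("[introduction][html]")
      body       = event.get("[body]")

      # compact drops the nil entries, so only existing parts are joined
      text = [intro_text, body].compact.join(" ")
      html = [intro_html, body].compact.join(" ")

      # Only create the target fields when there is something to put in them
      event.set("texte", text) unless text.empty?
      event.set("texte_html", html) unless html.empty?
    '
  }
}
```

Because missing keys come back as nil and are dropped before joining, no field is created when neither the introduction nor the body is present.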

With add_field in the json filter, I got some fields that contain only the literal %{[xxx]} reference.
For example, if the introduction field isn't in the JSON, the output (stdout) is:

      intro_text => "%{[introduction][text]}"
      intro_html => "%{[introduction][html]}"
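
This happens because add_field leaves a %{...} reference as literal text when the referenced field does not exist. One way around it, sketched here with the field names from the pipeline above and untested against the real documents, is to guard each add_field with a conditional on the source field:

```conf
filter {
  json {
    source => "message"
  }

  # A bare field reference in a conditional is true only when the field exists
  # (and is neither null nor false)
  if [introduction][text] and [body] {
    mutate { add_field => { "texte" => "%{[introduction][text]} %{[body]}" } }
  } else if [introduction][text] {
    mutate { add_field => { "texte" => "%{[introduction][text]}" } }
  } else if [body] {
    mutate { add_field => { "texte" => "%{[body]}" } }
  }

  if [introduction][html] and [body] {
    mutate { add_field => { "texte_html" => "%{[introduction][html]} %{[body]}" } }
  } else if [introduction][html] {
    mutate { add_field => { "texte_html" => "%{[introduction][html]}" } }
  } else if [body] {
    mutate { add_field => { "texte_html" => "%{[body]}" } }
  }
}
```

Testing the source fields directly avoids creating the intermediate intro_text and intro_html fields and comparing them against the literal placeholder string.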

@Tomo_M Thank you for your advice. I now use an ingest pipeline for the JSON transformation; it's easier.

    {
      "set": {
        "field": "intro_html",
        "value": "{{introduction.html}}",
        "ignore_empty_value": true
      }
    }

I see, those fields were created by the "add_field" block. Thanks. I'm glad to hear you found a way.
