Extract string from JSON and put into new value

I am still fairly new to Logstash and I am starting to get on the tips of my skates when it comes to understanding how to do what I need to.

I am receiving JSON from a device (via HTTP payload) and it is super easy to slap it into ElasticSearch that way. However, one of the fields contains information that I want to parse out and use to generate a new field. I know how to use the grok parser if the whole thing is a string (like syslog), but I can't figure out how to do it when the data arrives as JSON.

Here is an example of the output I get now:

    {
        "headers" => {
            "http_accept" => "*/*",
            "content_type" => "application/json",
            "request_path" => "/test-api",
            "http_version" => "HTTP/1.1",
            "request_method" => "POST",
            "http_host" => "192.168.86.140:9563",
            "request_uri" => "/test-api",
            "content_length" => "374"
        },
        "domain_id" => "Suspicious domain seen (domain.name:xiterzao.ddns.net)(654391)",
        "rule" => "domain_rule",
        "dst_ip" => "192.168.55.2",
        "domain_category" => "external",
        "tags" => [
            [0] "DNS"
        ],
        "src_ip" => "192.168.45.132",
        "processed" => "0",
        "device_name" => "MXVM",
        "@timestamp" => 2017-11-22T23:31:11.455Z,
        "received_at" => "2017-11-22T23:31:11.455Z",
        "@version" => "1",
        "host" => "192.168.86.122",
        "monitor_tag" => "",
        "msg_gen_time" => "2017/11/22 15:31:12"
    }

Here is the conf file I am using:

input {
    http {
        host => "192.168.86.140"
        port => '9563'
    }
}
filter {
    grok {
        match => {"message" => "%{GREEDYDATA:msg_body}"}
        add_field => [ "received_at", "%{@timestamp}" ]
        add_field => [ "processed", 0 ]
        add_tag => [ "DNS" ]
    }
    if "Suspicious domain seen" in ["domain_id"] {
        mutate {
                add_field => [ "it_worked", "True" ]
        }
   }
}

output {
    if "DNS" in [tags] {
        elasticsearch {
            hosts => ["192.168.86.140:9200"]
            index => ["dns"]
        }
    }
   stdout { codec => rubydebug }
}

I tried adding the if conditional in there to see if I could even grab the right thing but I don't get that in the output, so I guess I am looking at it in the wrong way. I tried with ["domain_id"] =~ "Suspicious domain seen" as well and I get the same result.

What I ultimately want to do is create a new field (domain) with the above "xiterzao.ddns.net" extracted from domain_id value and add that to the output as well because I need to pull it from ElasticSearch later and do a lookup in another application on it.

Don't know, maybe I have been just looking at this too long (all day) and am overthinking it. Thanks for any help.

   match => {"message" => "%{GREEDYDATA:msg_body}"}

If you want to copy or rename the message field just use a mutate filter. There's no reason to use grok here.
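
A minimal sketch of what that could look like, assuming the event actually carries a message field and reusing the msg_body name from the grok above:

```
filter {
    mutate {
        # Rename message to msg_body (the original message field goes away).
        rename => { "message" => "msg_body" }
        # To keep message as well, use copy instead of rename:
        # copy => { "message" => "msg_body" }
    }
}
```

If the event has no message field, this filter should simply leave the event unchanged.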

if "Suspicious domain seen" in ["domain_id"] {

Drop the double quotes around domain_id inside the square brackets so that it becomes a field reference: [domain_id].
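
For illustration, the conditional from the original config with only that change applied would look roughly like this (the it_worked field is the poster's own test marker):

```
filter {
    # [domain_id] refers to the event field; the quoted text is the substring to look for.
    if "Suspicious domain seen" in [domain_id] {
        mutate {
            add_field => { "it_worked" => "True" }
        }
    }
}
```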

Thanks @magnusbaeck.

I took out the grok and tried using mutate and never got anything. Obviously I did something wrong, but until I get this other part working I am not even going to attempt to futz with that. So, I put it back in for now.

Removing the quotes fixed the conditional so now it creates the field. Now I just need to figure out how to extract the domain name from the domain_id line. I will keep working on it and come back if I can't get it to work (which I couldn't yesterday).

For the application/json content type, the http input uses the json codec to decode the JSON request body into fields on the event. Therefore there will be no message field for grok to operate on, and because there is no match, the add_field and add_tag settings will not be applied (they are applied on 'success' only).

You can use grok on the domain_id field though:

input {
    http {
        host => "192.168.86.140"
        port => '9563'
    }
}
filter {
    grok {
        match => {"message" => "%{GREEDYDATA:msg_body}"}
        add_field => [ "received_at", "%{@timestamp}" ]
        add_field => [ "processed", 0 ]
        add_tag => [ "DNS" ]
    }
    if [domain_id] =~ "^Suspicious domain seen" {
        mutate {
            match => {"domain_id" => "^Suspicious domain seen \(domain.name:%{HOSTNAME:[suspicious_domain]}\)\(%{NUMBER:[number]}\)"}
            add_field => [ "it_worked", "True" ]
        }
   }
}

output {
    if "DNS" in [tags] {
        elasticsearch {
            hosts => ["192.168.86.140:9200"]
            index => ["dns"]
        }
    }
   stdout { codec => rubydebug }
}

Thanks, but I get this error when I try using the match in the mutate:

[2017-11-23T09:44:10,515][ERROR][logstash.filters.mutate ] Unknown setting 'match' for mutate

I am running 6.x if that should make a difference.

As for the explanation that add_field and add_tag are only applied when the match on message succeeds: they have been added every time, so what I see contradicts that. I don't understand what is supposed to be wrong with it. I am still learning, but if something works, I don't want to start changing it until I get the other parts working. If it's a best practice or something, I understand and will revisit it later when I have everything working the way I need it.

I got past the reference of the element by removing the quotes and now I want to do what you stated in the mutate->match part but I get the error above. It makes sense that it should work.

Thanks for the help.

Sorry, my cut and paste was totally wrong. This is what I meant for the filter section:

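filter {
    # Only parse events whose domain_id starts with the "Suspicious domain seen" message.
    if [domain_id] =~ "^Suspicious domain seen" {
        grok {
            # Capture the hostname and the trailing number from domain_id.
            match => {"domain_id" => "^Suspicious domain seen \(domain\.name:%{HOSTNAME:[suspicious_domain]}\)\(%{NUMBER:[number]}\)"}
            add_field => [ "received_at", "%{@timestamp}" ]
            add_field => [ "processed", 0 ]
            add_tag => [ "DNS" ]
        }
    }
}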
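
Since the original question asked for the new field to be called domain, here is a hedged variant of the same grok that writes to that name instead. The field names domain and domain_number are illustrative choices, not something the filter requires, and the literal dot in domain.name is escaped so it only matches a dot:

```
filter {
    if [domain_id] =~ "^Suspicious domain seen" {
        grok {
            # domain and domain_number are example field names; pick whatever suits the index.
            match => { "domain_id" => "^Suspicious domain seen \(domain\.name:%{HOSTNAME:domain}\)\(%{NUMBER:domain_number}\)" }
            add_tag => [ "DNS" ]
        }
    }
}
```

For the sample event shown earlier, this should add domain with the value xiterzao.ddns.net and domain_number with the value 654391, both as strings.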

@guyboertje @magnusbaeck
This is perfect and does exactly what I need it to. Thank you!!!

I really appreciate the patience you guys have and the help you have given. I am now more (dangerously) powerful and can show others that ElasticStack is what we need to use, rather than homegrown tools that attempt to do the same thing in archaic ways.

Keep your questions coming. Expanding the use of the Elastic stack is a bonus. Happy to have helped.
