Logstash: whitelist nested JSON objects, keep only specific fields


(Petr Simik) #1

I can't filter nested JSON objects.

I am trying to implement network packet analysis.

The result has too many fields. I want to keep only the messages that contain "http_http_file_data" and import only specific fields.

I tried prune and an if statement, but the conditions are always ignored and everything is imported into the index.

I need to import only selected fields (see the example below), and only the messages where http_http_file_data != null.

Thank you.

```
input {
  file {
    path => "/home/user/pcap/packets3.json"
    start_position => "beginning"
    ignore_older => 0
  }
}

filter {
  # Drop Elasticsearch Bulk API control lines
  if ([message] =~ "{\"index") {
    drop {}
  }

  json {
    source => "message"
    remove_field => "message"
  }

  # Extract innermost network protocol
  grok {
    match => {
      "[layers][frame][frame_frame_protocols]" => "%{WORD:protocol}$"
    }
  }

  # This prune does not work: with this config everything is blacklisted and no condition is matched
  prune {
    whitelist_names => [ "timestamp",
      "[layers][http][http_http_file_data]",
      "[layers][http][http_authorization_http_authbasic]",
      "[layers][http][http_http_host]",
      "[layers][http][http_http_request_full_uri]",
      "[layers][ip][ip_ip_addr]" ]
  }

  date {
    match => [ "timestamp", "UNIX_MS" ]
  }
}

output {
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
    index => "user-pcap-test"
  }
  stdout { }
}
```
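
A note for readers on why the prune block likely behaves this way: as far as the prune filter documentation describes it, prune only looks at top-level field names, and each entry in whitelist_names is treated as a regular expression matched against those names, not as a field reference. A string such as "[layers][http][http_http_file_data]" is therefore read as a sequence of regex character classes, and the top-level "layers" object can end up removed as a whole. One possible workaround, sketched here under the assumption that the json filter has already run, is to move the nested values to top-level fields first and then whitelist those. The target names http_file_data, http_host and ip_src are illustrative and not from the original config.

```
filter {
  # Move the nested values of interest to top-level fields.
  # rename skips any source field that does not exist on the event.
  # The target names on the right are hypothetical; pick your own.
  mutate {
    rename => {
      "[layers][http][http_http_file_data]" => "http_file_data"
      "[layers][http][http_http_host]"      => "http_host"
      "[layers][ip][ip_ip_src]"             => "ip_src"
    }
  }

  # prune matches regular expressions against top-level names only,
  # so anchor each pattern to avoid accidental partial matches.
  prune {
    whitelist_names => [ "^@timestamp$", "^timestamp$",
                         "^http_file_data$", "^http_host$", "^ip_src$" ]
  }
}
```

Everything not matched by the whitelist, including what is left of the "layers" object, is dropped from the event in this sketch.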

The simplified JSON looks like this:
{"timestamp" : "1538279088714", "layers" : {"frame": {},"eth": {},"ip": {"ip_ip_src": "10.1.1.2"},"tcp": {"tcp_tcp_port": "34908"},"http": {"http_text": "HTTP/1.1 200 OK\r\n","http_http_file_data": "<?xml IremovedThisXmlData>"},"xml": {}}}


(Petr Simik) #2

I resolved the problem with mutate. I parse the JSON into parsed_json, use mutate add_field to copy the fields I want (whitelist-like), and finally remove the source message and the parsed object.

1)
filter {
    # create parsed_json from the input message
    json {
        source => "message"
        target => "parsed_json"
    }

2)
    # copy the wanted fields to the top level, then remove the source message and the parsed object
    mutate {
        add_field => {
            "timestamp"  => "%{[parsed_json][timestamp]}"
            "http_data"  => "%{[parsed_json][layers][http][http_http_file_data]}"
            "ip_src"     => "%{[parsed_json][layers][ip][ip_ip_src_host]}"
            "ip_dst"     => "%{[parsed_json][layers][ip][ip_ip_dst_host]}"
            "frame_time" => "%{[parsed_json][layers][frame][frame_frame_time]}"
        }
        remove_field => [ "message", "parsed_json" ]
    }
}
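
One caveat with this approach: when a field referenced in a "%{...}" expression does not exist on the event, Logstash leaves the literal "%{...}" text in the new field instead of skipping it. A possible refinement, sketched below under the assumption that events without HTTP file data should be discarded (as asked in the first post), is to wrap the mutate in a conditional. Only two fields are shown for brevity.

```
filter {
  json {
    source => "message"
    target => "parsed_json"
  }

  if [parsed_json][layers][http][http_http_file_data] {
    # The field exists, so the %{...} reference resolves to a real value.
    mutate {
      add_field => {
        "timestamp" => "%{[parsed_json][timestamp]}"
        "http_data" => "%{[parsed_json][layers][http][http_http_file_data]}"
      }
      remove_field => [ "message", "parsed_json" ]
    }
  } else {
    # No HTTP file data: do not index this packet at all.
    drop { }
  }
}
```

The other optional fields (ip_src, ip_dst, frame_time) could be guarded the same way if they are not present in every packet.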
