Nested field in a grok filter

Hi,

I have a log line that looks like this:
inactive: No, sources : [{id: 1, type: fr, name:custom}, {id:2, type:fr, name: random}], assigned: Yes

I wrote this custom grok filter:
inactive: %{WORD:inactive}, sources : [id: %{NUMBER:id}, type: %{WORD:fr}, name: %{WORD:name}], assigned: %{WORD:assigned}

But the problem is that I want a nested structure like this:
source1 :
id1:
type1:
name1:

source2:
etc

I tried %{NUMBER:[source][id]}, for example, but it doesn't work.

Any help?
Thank you.

    grok { match => { "message" => "^inactive: %{WORD:inactive}, sources : \[(?<[@metadata][sources]>[^\]]+)\], assigned: %{WORD:assigned}" } }
    ruby {
        code => '
            matches = event.get("[@metadata][sources]").scan(/{id:\s*([0-9]+), type:\s*([a-zA-Z0-9]+), name:\s*([a-zA-Z0-9]+)}/)
            event.set("matches", matches)
        '
    }

will get you to

   "matches" => [
    [0] [
        [0] "1",
        [1] "fr",
        [2] "custom"
    ],
    [1] [
        [0] "2",
        [1] "fr",
        [2] "random"
    ]
]

It is unclear what structure you want that data in, so you might be able to move stuff around using mutate, or you may need ruby.
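For example, here is a sketch of the reshaping half in plain Ruby, assuming you want each source as a hash with named keys (the field names `sources`, `id`, `type`, `name` are just illustrative, and the string stands in for the `[@metadata][sources]` capture):

```ruby
# Stand-in for event.get("[@metadata][sources]")
sources_text = "{id: 1, type: fr, name:custom}, {id:2, type:fr, name: random}"

# Same scan as in the filter above: one [id, type, name] array per source
matches = sources_text.scan(/{id:\s*([0-9]+), type:\s*([a-zA-Z0-9]+), name:\s*([a-zA-Z0-9]+)}/)

# Reshape each positional match into a hash with named keys; this is the
# structure event.set("sources", sources) would then store on the event.
sources = matches.map { |id, type, name| { "id" => id, "type" => type, "name" => name } }
```

Dropping that `map` into the ruby filter's `code` string, in place of the plain `event.set("matches", matches)`, gives you an array of named objects instead of positional arrays.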

Hello,

Thanks for your help. Unfortunately, this doesn't work: Logstash failed to restart because of invalid syntax in the ruby filter. Here it is:

ruby {
        	code => '
		matches = event.get("[@metadata][log_sources]").scan(/{u'id': ([0-9]+), u'type_name': u'([a-zA-Z0-9\D\s]+)', u'name': u'([a-zA-Z0-9\D\s]+)', u'type_id': ([0-9]+)/})
		event.set("matches", matches)
		'
	}

The payload in the data is:
{u'id': 50, u'type_name': u'Audit', u'name': u'Audit @ 10.10.10.10', u'type_id': 38}, {u'id': 55, u'type_name': u'Audit', u'name': u'Audit @ 10.10.10.10', u'type_id': 41},

I put \[(?<[@metadata][sources]>[^\]]+)\] in the grok filter, and it seems fine as long as I don't add the ruby filter.

I also tried these:

ruby {
    	code => ' 
	matches = event.get("[@metadata][log_sources]").scan(\{u\'id\': ([0-9]+), u\'type_name\': u\'([a-zA-Z0-9\D\s]+)\', u\'name\': u\'([a-zA-Z0-9\D\s]+)\', u\'type_id\': ([0-9]+)\})
	event.set("matches", matches)
	'
}


ruby {
        	code => ' 
		matches = event.get("[@metadata][log_sources]").scan(/{u\'id\': ([0-9]+), u\'type_name\': u\'([a-zA-Z0-9\D\s]+)\', u\'name\': u\'([a-zA-Z0-9\D\s]+)\', u\'type_id\': ([0-9]+)/})
		event.set("matches", matches)
		'
	}


ruby {
    	code => " 
		matches = event.get('[@metadata][log_sources]').scan(/{u'id': ([0-9]+), u'type_name': u'([a-zA-Z0-9\D\s]+)', u'name': u'([a-zA-Z0-9\D\s]+)', u'type_id': ([0-9]+)/})
		event.set('matches', matches)
		"
	}

There are a couple of problems with this. scan takes a regexp, which is delimited by /. At the end of the call to scan you have

u'type_id': ([0-9]+)/})

which should be

u'type_id': ([0-9]+)}/)

The other problem is that by adding \D\s in order to capture the IP address, you make the character class match any character at all (\D is "any non-digit", and the class already includes the digits), so the greedy + captures almost the entire string.
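A quick plain-Ruby comparison shows the effect, using one record from the sample payload above:

```ruby
text = "{u'id': 50, u'type_name': u'Audit', u'name': u'Audit @ 10.10.10.10', u'type_id': 38}"

# [a-zA-Z0-9\D\s] contains \D (any non-digit) plus the digits, so it
# matches ANY character; the greedy + then runs past the closing quote,
# only backtracking to the last quote anywhere in the string.
greedy = text.scan(/u'type_name': u'([a-zA-Z0-9\D\s]+)'/)

# [^']+ cannot cross a quote, so it captures exactly one field.
bounded = text.scan(/u'type_name': u'([^']+)'/)
```

The greedy version swallows the name field and the IP address into the type_name capture; the bounded version returns just "Audit".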

Provided that the name of the field in the grok is consistent with the name in the ruby code ([@metadata][log_sources] vs. [@metadata][sources]) then this

ruby {
    code => "
        matches = event.get('[@metadata][log_sources]').scan(/{u'id': ([0-9]+), u'type_name': u'([^']+)', u'name': u'([^']+)', u'type_id': ([0-9]+)}/)
        event.set('matches', matches)
    "
}

will get you

   "matches" => [
    [0] [
        [0] "50",
        [1] "Audit",
        [2] "Audit @ 10.10.10.10",
        [3] "38"
    ],
    [1] [
        [0] "55",
        [1] "Audit",
        [2] "Audit @ 10.10.10.10",
        [3] "41"
    ]
]

Wonderful! Thank you very much! It's all clear now!

So I tried that, and Logstash starts, but I get this error in the logs, and I don't see the fields in Kibana:

Jul 17 16:13:53 localhost logstash: [2019-07-17T16:13:53,620][ERROR][logstash.filters.ruby ] Ruby exception occurred: undefined method `scan' for nil:NilClass

That means that event.get did not return anything. You could try

code => "
    log_sources = event.get('[@metadata][log_sources]')
    if log_sources
        matches = log_sources.scan(/{u'id': ([0-9]+), u'type_name': u'([^']+)', u'name': u'([^']+)', u'type_id': ([0-9]+)}/)
        event.set('matches', matches)
    end
"

Everything works now! Thank you very, very much, Badger :smiley:

One last little question: I have other parts in my log file that also contain nested patterns.
Do I have to add them after the code => " like this?

code => "
    log_sources = event.get('[@metadata][log_sources]')
    if log_sources
        matches = log_sources.scan(/{u'id': ([0-9]+), u'type_name': u'([^']+)', u'name': u'([^']+)', u'type_id': ([0-9]+)}/)
        event.set('matches', matches)
    end

    rules = event.get ...

"

Edit: one more question. If, for a given log line, there is just one id, type_name, name, and type_id in [log_sources], is it still recorded in Kibana? If not, how can we handle this?

Yes, you can do the same type of treatment for more than one type of log line.

scan should return an array with a single member if there is only one match in log_sources, so it should show up in Kibana.
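Both points can be checked with a quick plain-Ruby sketch. The string stands in for the grok capture, and the second field name and its pattern are placeholders for whatever your rules section looks like:

```ruby
# One single source in the field: scan still returns an array, just with
# a single [id, type_name, name, type_id] entry, so it indexes as usual.
log_sources = "{u'id': 50, u'type_name': u'Audit', u'name': u'Audit @ 10.10.10.10', u'type_id': 38}"
matches = log_sources.scan(/{u'id': ([0-9]+), u'type_name': u'([^']+)', u'name': u'([^']+)', u'type_id': ([0-9]+)}/)

# A second nested pattern is handled the same way, with its own nil
# guard. This field and regexp are hypothetical, not your real format.
rules = nil   # e.g. this particular log line had no rules section
rule_matches = rules ? rules.scan(/id: ([0-9]+)/) : nil
```

Each `event.get` result gets its own guard, so a log line that is missing one of the sections never raises the `undefined method 'scan' for nil` error.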

Thank you Badger :smiley:

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.