Nested field in a grok filter

Hi,

I have a log line that looks like this:
inactive: No, sources : [{id: 1, type: fr, name:custom}, {id:2, type:fr, name: random}], assigned: Yes

I wrote this custom grok filter:
inactive: %{WORD:inactive}, sources : [id: %{NUMBER:id}, type: %{WORD:fr}, name: %{WORD:name}], assigned: %{WORD:assigned}

But the problem is that I want a nested structure like this:
source1 :
id1:
type1:
name1:

source2:
etc

I tried %{NUMBER:[source][id]}, for example, but it doesn't work.

Any help?
Thank you.

    grok { match => { "message" => "^inactive: %{WORD:inactive}, sources : \[(?<[@metadata][sources]>[^\]]+)\], assigned: %{WORD:assigned}" } }
    ruby {
        code => '
            matches = event.get("[@metadata][sources]").scan(/{id:\s*([0-9]+), type:\s*([a-zA-Z0-9]+), name:\s*([a-zA-Z0-9]+)}/)
            event.set("matches", matches)
        '
    }

will get you to

   "matches" => [
    [0] [
        [0] "1",
        [1] "fr",
        [2] "custom"
    ],
    [1] [
        [0] "2",
        [1] "fr",
        [2] "random"
    ]
]

It is unclear what structure you want that data in, so you might be able to move stuff around using mutate, or you may need ruby.
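For example, here is a sketch of the reshaping half in plain Ruby, assuming you want each source as a hash with named keys (the field names `sources`, `id`, `type`, `name` are just illustrative, and the string stands in for the `[@metadata][sources]` capture):

```ruby
# Stand-in for event.get("[@metadata][sources]")
sources_text = "{id: 1, type: fr, name:custom}, {id:2, type:fr, name: random}"

# Same scan as in the filter above: one [id, type, name] array per source
matches = sources_text.scan(/{id:\s*([0-9]+), type:\s*([a-zA-Z0-9]+), name:\s*([a-zA-Z0-9]+)}/)

# Reshape each positional match into a hash with named keys; this is the
# structure event.set("sources", sources) would then store on the event.
sources = matches.map { |id, type, name| { "id" => id, "type" => type, "name" => name } }
```

Dropping that `map` into the ruby filter's `code` string, in place of the plain `event.set("matches", matches)`, gives you an array of named objects instead of positional arrays.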

Hello,

Thanks for your help. Unfortunately, this doesn't work: Logstash failed to restart because of invalid syntax in the ruby filter. Here it is:

ruby {
        	code => '
		matches = event.get("[@metadata][log_sources]").scan(/{u'id': ([0-9]+), u'type_name': u'([a-zA-Z0-9\D\s]+)', u'name': u'([a-zA-Z0-9\D\s]+)', u'type_id': ([0-9]+)/})
		event.set("matches", matches)
		'
	}

The payload in the data is:
{u'id': 50, u'type_name': u'Audit', u'name': u'Audit @ 10.10.10.10', u'type_id': 38}, {u'id': 55, u'type_name': u'Audit', u'name': u'Audit @ 10.10.10.10', u'type_id': 41},

I put \[(?<[@metadata][sources]>[^\]]+)\] in the grok filter, and it seems fine as long as I don't add the ruby filter.

I also tried these:

ruby {
    	code => ' 
	matches = event.get("[@metadata][log_sources]").scan(\{u\'id\': ([0-9]+), u\'type_name\': u\'([a-zA-Z0-9\D\s]+)\', u\'name\': u\'([a-zA-Z0-9\D\s]+)\', u\'type_id\': ([0-9]+)\})
	event.set("matches", matches)
	'
}


ruby {
        	code => ' 
		matches = event.get("[@metadata][log_sources]").scan(/{u\'id\': ([0-9]+), u\'type_name\': u\'([a-zA-Z0-9\D\s]+)\', u\'name\': u\'([a-zA-Z0-9\D\s]+)\', u\'type_id\': ([0-9]+)/})
		event.set("matches", matches)
		'
	}


ruby {
    	code => " 
		matches = event.get('[@metadata][log_sources]').scan(/{u'id': ([0-9]+), u'type_name': u'([a-zA-Z0-9\D\s]+)', u'name': u'([a-zA-Z0-9\D\s]+)', u'type_id': ([0-9]+)/})
		event.set('matches', matches)
		"
	}

There are a couple of problems with this. scan takes a regexp, which is delimited by /. At the end of the call to scan you have

u'type_id': ([0-9]+)/})

which should be

u'type_id': ([0-9]+)}/)

The other problem is that by adding \D\s in order to capture the IP address, you make the character class match any character at all (\D is "any non-digit", and the class already includes the digits), so the greedy + captures almost the entire string.
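A quick plain-Ruby comparison shows the effect, using one record from the sample payload above:

```ruby
text = "{u'id': 50, u'type_name': u'Audit', u'name': u'Audit @ 10.10.10.10', u'type_id': 38}"

# [a-zA-Z0-9\D\s] contains \D (any non-digit) plus the digits, so it
# matches ANY character; the greedy + then runs past the closing quote,
# only backtracking to the last quote anywhere in the string.
greedy = text.scan(/u'type_name': u'([a-zA-Z0-9\D\s]+)'/)

# [^']+ cannot cross a quote, so it captures exactly one field.
bounded = text.scan(/u'type_name': u'([^']+)'/)
```

The greedy version swallows the name field and the IP address into the type_name capture; the bounded version returns just "Audit".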

Provided that the name of the field in the grok is consistent with the name in the ruby code ([@metadata][log_sources] vs. [@metadata][sources]) then this

ruby {
    code => "
        matches = event.get('[@metadata][log_sources]').scan(/{u'id': ([0-9]+), u'type_name': u'([^']+)', u'name': u'([^']+)', u'type_id': ([0-9]+)}/)
        event.set('matches', matches)
    "
}

will get you

   "matches" => [
    [0] [
        [0] "50",
        [1] "Audit",
        [2] "Audit @ 10.10.10.10",
        [3] "38"
    ],
    [1] [
        [0] "55",
        [1] "Audit",
        [2] "Audit @ 10.10.10.10",
        [3] "41"
    ]
]

Wonderful! Thank you very much! It's all clear now!

So I tried that, and Logstash starts, but I get this error in the logs, and I don't see the fields in Kibana:

Jul 17 16:13:53 localhost logstash: [2019-07-17T16:13:53,620][ERROR][logstash.filters.ruby ] Ruby exception occurred: undefined method `scan' for nil:NilClass

That means that event.get did not return anything. You could try

code => "
    log_sources = event.get('[@metadata][log_sources]')
    if log_sources
        matches = log_sources.scan(/{u'id': ([0-9]+), u'type_name': u'([^']+)', u'name': u'([^']+)', u'type_id': ([0-9]+)}/)
        event.set('matches', matches)
    end
"

Everything works now! Thank you very, very much, Badger :smiley:

One last little question: I have other parts in my log file that also contain nested patterns.
Do I have to add them after the code => " like this?

code => "
    log_sources = event.get('[@metadata][log_sources]')
    if log_sources
        matches = log_sources.scan(/{u'id': ([0-9]+), u'type_name': u'([^']+)', u'name': u'([^']+)', u'type_id': ([0-9]+)}/)
        event.set('matches', matches)
    end

    rules = event.get ...

"

Edit: one more question. If, for a given log line, there is just one id, type_name, name, and type_id in [log_sources], is it still recorded in Kibana? If not, how can we handle this?

Yes, you can do the same type of treatment for more than one type of log line.

scan should return an array with a single member if there is only one match in log_sources, so it should show up in Kibana.
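Both points can be checked with a quick plain-Ruby sketch. The string stands in for the grok capture, and the second field name and its pattern are placeholders for whatever your rules section looks like:

```ruby
# One single source in the field: scan still returns an array, just with
# a single [id, type_name, name, type_id] entry, so it indexes as usual.
log_sources = "{u'id': 50, u'type_name': u'Audit', u'name': u'Audit @ 10.10.10.10', u'type_id': 38}"
matches = log_sources.scan(/{u'id': ([0-9]+), u'type_name': u'([^']+)', u'name': u'([^']+)', u'type_id': ([0-9]+)}/)

# A second nested pattern is handled the same way, with its own nil
# guard. This field and regexp are hypothetical, not your real format.
rules = nil   # e.g. this particular log line had no rules section
rule_matches = rules ? rules.scan(/id: ([0-9]+)/) : nil
```

Each `event.get` result gets its own guard, so a log line that is missing one of the sections never raises the `undefined method 'scan' for nil` error.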

Thank you Badger :smiley:

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.