How to grok a field appearing multiple times in a log line

Hi,

I have syslog messages coming from Cisco ISE that have multiple entries of "Step" in one line.

Like
++++++++++++++++++++++
Step=11001, Step=11017, Step=15049, Step=15008, Step=15048, Step=15048, Step=15048, Step=15048, Step=11507, Step=12300, Step=11006, Step=11001, Step=11018, Step=12302, Step=12319, Step=12800, Step=12805, Step=12806, Step=12807, Step=12808, Step=12810, Step=12811, Step=12305, Step=11006, Step=11001, Step=11018, Step=12304, Step=12305, Step=11006, Step=11001, Step=11018, Step=12304, Step=12305, Step=11006, Step=11001, Step=11018, Step=12304, Step=12305, Step=11006, Step=11001, Step=11018, Step=12304, Step=12305, Step=11006, Step=11001, Step=11018, Step=12304, Step=12319, Step=12812, Step=12813, Step=12804, Step=12801, Step=12802, Step=12816, Step=12310, Step=12305, Step=11006, Step=11001, Step=11018, Step=12304, Step=12313, Step=11521, Step=12305, Step=11006, Step=11001, Step=11018, Step=12304, Step=11522, Step=12606, Step=12611, Step=15041, Step=22072, Step=15013, Step=12606, Step=12305, Step=11006, Step=11001, Step=11018, Step=12304, Step=12610, Step=15041, Step=22072, Step=15013, Step=24031, Step=24015, Step=24020, Step=22057, Step=22061, Step=12610, Step=12611, Step=15041, Step=22072, Step=15013, Step=12610, Step=12305, Step=11006, Step=11001, Step=11018, Step=12304, Step=12610, Step=15041, Step=22072, Step=15013, Step=24031, Step=24015, Step=24020, Step=22057, Step=22061, Step=12610, Step=12623, Step=11520, Step=22028, Step=12305, Step=11006, Step=11001, Step=11018, Step=12304, Step=12917, Step=11500, Step=61025, Step=11504, Step=11003, Step=5434

+++++++++++++++++++++++++++++

The number of times "Step" occurs in a log line can vary.

I am not sure how best to address this when writing a grok filter for situations like this.

Does anyone have a suggestion?

Regards,
Prakash.

What's the desired outcome?

Grok is the wrong tool for this.

Thanks for your reply.

We are very new to the Elastic Stack.

We are trying to feed our Cisco ISE logs into Logstash and write filters to make sense of these logs.

Cisco ISE sends 7 to 8 different types of log messages, and I am writing that many grok match patterns.
Maybe there is a simpler way than writing a "grok match" for every kind of log?

For the above situation, the outcome I would prefer is to collect all the Step values into one Step field.

Or can you recommend a better solution?

What do these step numbers represent? How do you want to process them?

I believe they may be latency information representing part of the authentication or authorisation process, in milliseconds.

Handy information for determining whether the latency spikes at a particular stage, i.e. in this snippet the average might be 12000ms and we see that from step 3 to step 8 it's over 15000ms.

As said, it is just part of a message, repeated an unpredictable number of times in Cisco ISE logs.

Okay, but how do the numbers in a particular message relate? Are they connected to each other in some way or should they be considered independent?

What I'm trying to understand is whether a message like the one above should be split into one document per step number or if all step numbers in a message should be put in an array in a single message.

I understand what you are saying, but currently we do not have a plan for how to use those values.

Currently I am looking for a solution to aggregate all these values in one field.

In the future we may decide these need to be separated into, let's say, Step1, Step2, ... Stepn.

Okay. Use a mutate filter's split option to split

Step=1, Step=2, ...

into an array,

["Step=1", "Step=2", ...]

then use a mutate filter's gsub option to remove the Step= prefix from each element.
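
As a minimal sketch of that split-then-gsub approach: assuming an earlier grok has captured the repeated list into a field named "steps" (the field name is made up for illustration), it could look like this:

filter {
  mutate {
    # Split on the comma-space separator, turning
    # "Step=11001, Step=11017, ..." into the array
    # ["Step=11001", "Step=11017", ...]
    split => { "steps" => ", " }
  }
  mutate {
    # Remove the "Step=" prefix from every element of the array,
    # leaving ["11001", "11017", ...]
    gsub => [ "steps", "Step=", "" ]
  }
}

Two separate mutate filters are used on purpose: within a single mutate block the operations run in a fixed internal order, so keeping split and gsub in separate filters makes the split-first sequencing explicit.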


Thanks,

I have one more question for you.

Let's say I get 3-4 unique kinds of messages,

like ... (every field is separated by a space in each log line)

DATESTAMP A B C D E F
DATESTAMP A B G H E F
DATESTAMP A B I J E F

For this, my filter will look like:

match => { "message" => [
  "%{SYSLOGTIMESTAMP:timestamp} %{DATA:A} %{DATA:B} %{DATA:C} %{DATA:D} %{DATA:E} %{DATA:F} ,",
  "%{SYSLOGTIMESTAMP:timestamp} %{DATA:A} %{DATA:B} %{DATA:G} %{DATA:H} %{DATA:E} %{DATA:F},",
  "%{SYSLOGTIMESTAMP:timestamp} %{DATA:A} %{DATA:B} %{DATA:I} %{DATA:J} %{DATA:E} %{DATA:F},"
] }

Given that fields A, B and E, F are common to all log lines, is there a way to avoid writing these fields multiple times in the match line?

I hope you get what my question is.

Is that really what the expressions look like (except the field names)? Because all input lines will be matched by the first expression.

That was just an example...

It looks like this

Aug 31 16:10:15 XXX-ise-01 CISE_RADIUS_Accounting 0027477611 1 0 2018-08-31 16:10:15.462 +08:00 1078820660 3002 NOTICE Radius-Accounting: RADIUS Accounting watchdog update, ConfigVersionId=1259, Device IP Address=XXXXXXXXX, RequestLatency=4, NetworkDeviceName=XXXXXXX, User-Name=T8057155

And like this, there are different kinds of logs:

RADIUS Accounting watchdog update
RADIUS Accounting start reque

And then

CISE_Passed_Authentications
CISE_RADIUS_Accounting

So in all these logs, most of the fields are identical and a few differ. That's why I have to write that many matches, and it all looks a little messy.

I was wondering if there is a way to factor out the common fields and then handle the unique ones.

I am not sure if I am able to express myself correctly, but I will try. :smile: Appreciate your help, though.

You have a couple of options that might help:

  • Define custom grok patterns.
  • Use two grok filters: one that extracts the common pieces and saves the rest in another field, which is then processed by a second grok filter.

That's a great suggestion.

Do you have any example or documentation on that? It would be really helpful and bring me up to speed.

Do you have any example or documentation on that

Which of the suggestions are you talking about?

These ones

The grok filter documentation describes at length how to define custom patterns.
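
As a sketch of the custom-pattern route: the grok filter's pattern_definitions option lets you name a sub-pattern inline and reuse it in every match. Here the pattern name ISE_COMMON is made up, and the field names reuse the A/B placeholders from the example above:

grok {
  pattern_definitions => {
    # Hypothetical named pattern bundling the fields that are
    # common to every log line, so they are written only once.
    "ISE_COMMON" => "%{DATA:A} %{DATA:B}"
  }
  match => {
    "message" => "%{SYSLOGTIMESTAMP:timestamp} %{ISE_COMMON} %{GREEDYDATA:rest}"
  }
}

A reusable pattern can also be kept in a separate patterns file and loaded via the filter's patterns_dir option, which is handy once several filters share the same definitions.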

As for two grok filters, it could look like this:

grok {
  match => {
    "message" => "^%{SOME-TIMESTAMP-PATTERN:timestamp} %{GREEDYDATA:rest}"
  }
}
grok {
  match => {
    "rest" => "..."
  }
}

The first filter obviously isn't limited to the timestamp; it's up to you how you want to divide the responsibility between the filters.


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.