How to grok a field appearing multiple times in a log line

Hi,

I have syslog messages coming from Cisco ISE that have multiple entries of "Step" in one line.

Like
++++++++++++++++++++++
Step=11001, Step=11017, Step=15049, Step=15008, Step=15048, Step=15048, Step=15048, Step=15048, Step=11507, Step=12300, Step=11006, Step=11001, Step=11018, Step=12302, Step=12319, Step=12800, Step=12805, Step=12806, Step=12807, Step=12808, Step=12810, Step=12811, Step=12305, Step=11006, Step=11001, Step=11018, Step=12304, Step=12305, Step=11006, Step=11001, Step=11018, Step=12304, Step=12305, Step=11006, Step=11001, Step=11018, Step=12304, Step=12305, Step=11006, Step=11001, Step=11018, Step=12304, Step=12305, Step=11006, Step=11001, Step=11018, Step=12304, Step=12319, Step=12812, Step=12813, Step=12804, Step=12801, Step=12802, Step=12816, Step=12310, Step=12305, Step=11006, Step=11001, Step=11018, Step=12304, Step=12313, Step=11521, Step=12305, Step=11006, Step=11001, Step=11018, Step=12304, Step=11522, Step=12606, Step=12611, Step=15041, Step=22072, Step=15013, Step=12606, Step=12305, Step=11006, Step=11001, Step=11018, Step=12304, Step=12610, Step=15041, Step=22072, Step=15013, Step=24031, Step=24015, Step=24020, Step=22057, Step=22061, Step=12610, Step=12611, Step=15041, Step=22072, Step=15013, Step=12610, Step=12305, Step=11006, Step=11001, Step=11018, Step=12304, Step=12610, Step=15041, Step=22072, Step=15013, Step=24031, Step=24015, Step=24020, Step=22057, Step=22061, Step=12610, Step=12623, Step=11520, Step=22028, Step=12305, Step=11006, Step=11001, Step=11018, Step=12304, Step=12917, Step=11500, Step=61025, Step=11504, Step=11003, Step=5434

+++++++++++++++++++++++++++++

The number of times "Step" occurs in a log line can vary.

I am not sure how best to address this when writing a grok filter for situations like this.

Does anyone have a suggestion?

Regards,
Prakash.

What's the desired outcome?

Grok is the wrong tool for this.

Thanks for your reply.

We are very new to the Elastic Stack.

We are trying to feed our Cisco ISE logs into Logstash and write filters to make sense of these logs.

Cisco ISE sends 7 to 8 different types of log messages, and I am writing that many grok match patterns.
Maybe there is a simpler way than writing a "grok match" for every kind of log?

For the above situation, the outcome I would prefer is to collect all the Step values into one Step field.

Or can you recommend a better solution?

What do these step numbers represent? How do you want to process them?

I believe they may be latency information representing part of the authentication or authorisation process, in milliseconds.

Handy information for determining whether the latency spikes at a particular stage, i.e. in this snippet the average might be 12000ms and we see that from step 3 to step 8 it's over 15000ms.

As said, it is just part of a message, repeated an unpredictable number of times in Cisco ISE logs.

Okay, but how do the numbers in a particular message relate? Are they connected to each other in some way or should they be considered independent?

What I'm trying to understand is whether a message like the one above should be split into one document per step number or if all step numbers in a message should be put in an array in a single message.

I understand what you are saying, but currently we do not have a plan for how to use those values.

Currently I am looking for a solution to aggregate all these values in one field.

In the future we may decide these need to be separated into, let's say, Step1, Step2, ... Stepn.

Okay. Use a mutate filter's split option to split

Step=1, Step=2, ...

into an array,

["Step=1", "Step=2", ...]

then use a mutate filter's gsub option to remove the Step= prefix from each element.
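
As a minimal sketch of that split-then-gsub approach: assuming an earlier grok has captured the repeated list into a field named "steps" (the field name is made up for illustration), it could look like this:

filter {
  mutate {
    # Split on the comma-space separator, turning
    # "Step=11001, Step=11017, ..." into the array
    # ["Step=11001", "Step=11017", ...]
    split => { "steps" => ", " }
  }
  mutate {
    # Remove the "Step=" prefix from every element of the array,
    # leaving ["11001", "11017", ...]
    gsub => [ "steps", "Step=", "" ]
  }
}

Two separate mutate filters are used on purpose: within a single mutate block the operations run in a fixed internal order, so keeping split and gsub in separate filters makes the split-first sequencing explicit.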


Thanks,

I have one more question for you.

Let's say I get 3-4 unique kinds of messages,

like ... (every field is separated by a space in each log line)

DATESTAMP A B C D E F
DATESTAMP A B G H E F
DATESTAMP A B I J E F

For this, my filter will look like:

match => { "message" => [
  "%{SYSLOGTIMESTAMP:timestamp} %{DATA:A} %{DATA:B} %{DATA:C} %{DATA:D} %{DATA:E} %{DATA:F} ,",
  "%{SYSLOGTIMESTAMP:timestamp} %{DATA:A} %{DATA:B} %{DATA:G} %{DATA:H} %{DATA:E} %{DATA:F},",
  "%{SYSLOGTIMESTAMP:timestamp} %{DATA:A} %{DATA:B} %{DATA:I} %{DATA:J} %{DATA:E} %{DATA:F},"
] }

Given that fields A, B and E, F are common to all log lines, is there a way to avoid writing these fields multiple times in the match line?

I hope you get what my question is.

Is that really what the expressions look like (except the field names)? Because all input lines will be matched by the first expression.

That was just an example...

It looks like this

Aug 31 16:10:15 XXX-ise-01 CISE_RADIUS_Accounting 0027477611 1 0 2018-08-31 16:10:15.462 +08:00 1078820660 3002 NOTICE Radius-Accounting: RADIUS Accounting watchdog update, ConfigVersionId=1259, Device IP Address=XXXXXXXXX, RequestLatency=4, NetworkDeviceName=XXXXXXX, User-Name=T8057155

And like this, there are different kinds of logs:

RADIUS Accounting watchdog update
RADIUS Accounting start reque

And then

CISE_Passed_Authentications
CISE_RADIUS_Accounting

So in all these logs, most of the fields are identical and a few differ. That's why I have to write that many matches, and it all looks a little messy.

I was wondering if there is a way to factor out the common fields and then handle the unique ones.

I am not sure if I am able to express myself correctly, but I will try. :smile: Appreciate your help, though.

You have a couple of options that might help:

  • Define custom grok patterns.
  • Use two grok filters: one that extracts the common pieces and saves the rest in another field, which is then processed by a second grok filter.

That's a great suggestion.

Do you have any example or documentation on that? It would be really helpful and bring me up to speed.

Do you have any example or documentation on that

Which of the suggestions are you talking about?

These ones

The grok filter documentation describes at length how to define custom patterns.
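
As a sketch of the custom-pattern route: the grok filter's pattern_definitions option lets you name a sub-pattern inline and reuse it in every match. Here the pattern name ISE_COMMON is made up, and the field names reuse the A/B placeholders from the example above:

grok {
  pattern_definitions => {
    # Hypothetical named pattern bundling the fields that are
    # common to every log line, so they are written only once.
    "ISE_COMMON" => "%{DATA:A} %{DATA:B}"
  }
  match => {
    "message" => "%{SYSLOGTIMESTAMP:timestamp} %{ISE_COMMON} %{GREEDYDATA:rest}"
  }
}

A reusable pattern can also be kept in a separate patterns file and loaded via the filter's patterns_dir option, which is handy once several filters share the same definitions.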

As for two grok filters, it could look like this:

grok {
  match => {
    "message" => "^%{SOME-TIMESTAMP-PATTERN:timestamp} %{GREEDYDATA:rest}"
  }
}
grok {
  match => {
    "rest" => "..."
  }
}

The first filter obviously isn't limited to the timestamp; it's up to you how you want to divide the responsibility between the filters.


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.