Parse text value from message but keep original text in message


(Bill Youngman) #1

Good Afternoon All,

I am currently using the Logstash Grok filter to parse values out of a log entry and populate index fields with them, and it is working like a champ. Now I want to parse certain values out of the log input into new index fields while keeping the original values in the input string.

Is there a way to use Grok parsing to accomplish this or is there another Logstash filter that will provide me the functionality I desire?

TIA,
Bill Youngman


#2

grok does not modify its input, so this should be what you are getting now. Can you show what your input line looks like and what your config is, and explain what you want to be different?


(Bill Youngman) #3

This is going to be rather lengthy but here goes - removing specific information concerning our environment...

This is the original log message...

$$ START $$
Time stamp: 18/02/23 13:20:06.058
Log level: OPS
Thread name: main priority: 5
Class: 'java class name'

Application Name: Data Extract
Application Version: 8.0.1.1.3
Partition: System
vertex_common: 8.0.1.1.1
vertex_util: 8.0.1.0.3
vertex_taxgis: 8.0.1.1.1
vertex_tps_calc_impl: 8.0.1.1.6
vertex_tps_ccc_impl: 8.0.1.1.1
Vertex Root: ..
Configuration File: file:/xx/xxxxxx/xxxxx/xxx/../config/xxx.cfg
Operating System: Windows Server 2012 R2 6.3 (amd64)
Java Home: xxx
JVM: Oracle Corporation 1.8.0_121
Max Heap Size: xxx

This is my grok filter--

	grok {
		match => { "message" => "Time stamp:\s%{DATESTAMP:logTime}.*Log level:\s%{WORD:logLevel}.*\sThread name:\s%{GREEDYDATA:logThread}.*\sClass:\s%{JAVACLASS:logClass}(.*Exception classification:\s%{WORD:logExceptionClassification})?.*?\n\s%{GREEDYDATA:logMessage}" }
	}

Here are the parsed outputs...

logClass: 'java class name'
logLevel: OPS
logThread: main priority: 5
logMessage:
Application Name: Data Extract
Application Version: 8.0.1.1.3
Partition: System
vertex_common: 8.0.1.1.1
vertex_util: 8.0.1.0.3
vertex_taxgis: 8.0.1.1.1
vertex_tps_calc_impl: 8.0.1.1.6
vertex_tps_ccc_impl: 8.0.1.1.1
Vertex Root: ..
Configuration File: file:/xx/xxxxxx/xxxxx/xxx/../config/xxx.cfg
Operating System: Windows Server 2012 R2 6.3 (amd64)
Java Home: xxx
JVM: Oracle Corporation 1.8.0_121
Max Heap Size: xxx

What I would like to do is parse 'Partition' and 'Application Name' out into two new index fields but also leave them in the logMessage field, so it would look like this...

logApplicationName: Data Extract
logPartition: System
logMessage:
Application Name: Data Extract
Application Version: 8.0.1.1.3
Partition: System
vertex_common: 8.0.1.1.1
vertex_util: 8.0.1.0.3
vertex_taxgis: 8.0.1.1.1
vertex_tps_calc_impl: 8.0.1.1.6
vertex_tps_ccc_impl: 8.0.1.1.1
Vertex Root: ..
Configuration File: file:/xx/xxxxxx/xxxxx/xxx/../config/xxx.cfg
Operating System: Windows Server 2012 R2 6.3 (amd64)
Java Home: xxx
JVM: Oracle Corporation 1.8.0_121
Max Heap Size: xxx

Let me know if this isn't descriptive enough.

Also the '$$ START $$' log entry is being filtered out by Filebeat.

Thanks,
Bill


#4

Oh, absolutely, you can just use a second grok filter. You have a single multi-line "message" from which you have grokked logMessage, so you can just grok that field. In case the order of lines ever changes, I would anchor on the field name and then grab zero or more not-newline characters followed by a newline...

grok {
  match => { "logMessage" => "cation Name:\s(?<logApplicationName>[^
]*)
" }
}

It might be possible to do this in a single grok using nested capture groups, but that is going to be much harder to read even if it works.
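For anyone who needs both fields from the question, the second grok might be extended along these lines. This is only a sketch under a few assumptions: that break_on_match => false is set so grok tries every pattern in the list instead of stopping at the first match, that `\n` and `\r` written as escapes inside the pattern are interpreted by the regex engine (the original filter in #3 already relies on `\n` this way), and that excluding `\r` as well is worthwhile because the logs come from a Windows host. The logPartition field name is taken from the desired output in #3. Since grok does not modify its input, logMessage stays intact.

```
filter {
  grok {
    # Assumption: try every pattern below rather than stopping at the first one that matches.
    break_on_match => false
    match => {
      "logMessage" => [
        # Anchor on the field name, then capture everything up to the end of that line.
        "Application Name:\s(?<logApplicationName>[^\r\n]*)",
        "Partition:\s(?<logPartition>[^\r\n]*)"
      ]
    }
  }
}
```

Because each pattern anchors on its own field name, the two lines can appear in any order within logMessage. If the escapes do not behave as expected in a given setup, the literal line break used in the reply above is the fallback.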

