platso588
(Leela Prasad Gorrepati)
December 5, 2017, 12:02pm
1
Hi All,
I have loaded a few YARN logs into Elasticsearch and am viewing them in Kibana. Each event has a source field such as:
source=/data06/yarn/logs/application_1510757389739_0634/container_e15_1510757554319_0034_01_000001/hive2kafka-export29.log
From source, I want to extract 'application_1510757389739_0634' and add it to a new field, applicationID.
I have created the filter below in Logstash:
if ([fields][log_type] == "yarnHive2kafkaLog") {
  grok {
    match => { "message" => "%{YEAR:logYear}-%{MONTHNUM:logMonth}-%{MONTHDAY:logDate} %{TIME:logTime} !%{SPACE}%{LOGLEVEL:logLevel}%{SPACE}! %{GREEDYDATA:message}"}
  }
  mutate {
    split => ["source", "/"]
    add_field => { "applicationID" => "%{source[4]}" }
  }
}
I could extract 'application_1510757389739_0634'; however, the source field itself was replaced by the split result and now shows up as a comma-separated list:
, data06, yarn, logs, application_1510757389739_0634, container_e15_1510757327319_0034_01_000001, hive2kafka-export30.log
The screenshot below shows this.
How do I fix this?
Thanks in advance.
You asked the mutate filter to split source into an array and that's what it did.
There are many ways to solve this, including making a copy of source before you split it or using a grok filter to extract the correct directory component.
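As a rough sketch of the grok option: the pattern below is an assumption based on the example path in the question (an application_<digits>_<digits> directory somewhere in source) and has not been tested against real data. Because grok patterns are not anchored, only that one path component needs to be described, and source itself is left untouched.

```
filter {
  if ([fields][log_type] == "yarnHive2kafkaLog") {
    grok {
      # Capture only the application_<digits>_<digits> path component
      # into applicationID; the rest of the path is ignored.
      match => { "source" => "/(?<applicationID>application_[0-9]+_[0-9]+)/" }
    }
  }
}
```

If source does not contain such a component, grok tags the event with _grokparsefailure by default, which makes unexpected paths easy to spot.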
platso588
(Leela Prasad Gorrepati)
December 5, 2017, 1:01pm
3
Thanks Magnus.
I tried both suggested approaches:
1. Copying source into a temporary field, as below:
if ([fields][log_type] == "yarnHive2kafkaLog") {
  grok {
    match => { "message" => "%{YEAR:logYear}-%{MONTHNUM:logMonth}-%{MONTHDAY:logDate} %{TIME:logTime} !%{SPACE}%{LOGLEVEL:logLevel}%{SPACE}! %{GREEDYDATA:message}"}
  }
  mutate {
    copy => { "source" => "source_tmp" }
    split => ["source_tmp", "/"]
    add_field => { "applicationID" => "%{source_tmp[3]}" }
  }
}
The applicationID was not extracted. The output is as below:
2. A grok filter on source:
if ([fields][log_type] == "yarnHive2kafkaLog") {
  grok {
    match => { "message" => "%{YEAR:logYear}-%{MONTHNUM:logMonth}-%{MONTHDAY:logDate} %{TIME:logTime} !%{SPACE}%{LOGLEVEL:logLevel}%{SPACE}! %{GREEDYDATA:message}"}
    match => { "source" => "/%{GREEDYDATA:primaryDir}/%{GREEDYDATA:subDir1}/%{GREEDYDATA:subDir2}/%{GREEDYDATA:subDir3}/%{GREEDYDATA:containerID}/%{GREEDYDATA:fileName}"}
  }
  mutate {
    add_field => { "applicationID" => "%{subDir3}" }
  }
}
I have the same issue here as well: applicationID ends up with the literal value %{subDir3}.
Could you please correct me on this?
mutate {
  copy => { "source" => "source_tmp" }
  split => ["source_tmp", "/"]
  add_field => { "applicationID" => "%{source_tmp[3]}" }
}
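split applies before copy. The mutate filter runs its operations in a fixed internal order, not in the order they are written, so source_tmp does not exist yet when split runs. You need to use at least two mutate filters, one after the other.
It's a similar thing with the grok filter solution. The grok filter stops after the first successful match, so you need to match against source in a separate grok filter.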
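For the grok case, an alternative to a second filter that should also work is turning off the grok filter's break_on_match option, so that every field listed in match is tried. The sketch below is untested. The message pattern is copied unchanged from the question, and the source pattern is an assumption based on the example path (an application_<digits>_<digits> directory).

```
filter {
  if ([fields][log_type] == "yarnHive2kafkaLog") {
    grok {
      # With the default (true), grok stops after the first field that
      # matches, so the source pattern would never be tried.
      break_on_match => false
      match => {
        "message" => "%{YEAR:logYear}-%{MONTHNUM:logMonth}-%{MONTHDAY:logDate} %{TIME:logTime} !%{SPACE}%{LOGLEVEL:logLevel}%{SPACE}! %{GREEDYDATA:message}"
        "source" => "/(?<applicationID>application_[0-9]+_[0-9]+)/"
      }
    }
  }
}
```

Two separate grok filters remain the more explicit choice, since each one then reports its own match failures independently.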
platso588
(Leela Prasad Gorrepati)
December 5, 2017, 1:37pm
5
This resolved the issue, thanks a ton.
Working code snippets for both approaches:
1. Copying source into a temporary field:
if ([fields][log_type] == "yarnHive2kafkaLog") {
  grok {
    match => { "message" => "%{YEAR:logYear}-%{MONTHNUM:logMonth}-%{MONTHDAY:logDate} %{TIME:logTime} \!%{SPACE}%{LOGLEVEL:logLevel}%{SPACE}\! %{GREEDYDATA:message}"}
  }
  mutate {
    copy => { "source" => "source_tmp" }
  }
  mutate {
    split => ["source_tmp", "/"]
    add_field => { "applicationID" => "%{source_tmp[4]}" }
  }
}
2. A grok filter on source:
if ([fields][log_type] == "yarnHive2kafkaLog") {
  grok {
    match => { "message" => "%{YEAR:logYear}-%{MONTHNUM:logMonth}-%{MONTHDAY:logDate} %{TIME:logTime} \!%{SPACE}%{LOGLEVEL:logLevel}%{SPACE}\! %{GREEDYDATA:message}"}
  }
  grok {
    match => { "source" => "/%{GREEDYDATA:primaryDir}/%{GREEDYDATA:subDir1}/%{GREEDYDATA:subDir2}/%{GREEDYDATA:subDir3}/%{GREEDYDATA:containerID}/%{GREEDYDATA:fileName}"}
  }
  mutate {
    add_field => { "applicationID" => "%{subDir3}" }
  }
}
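A possible refinement of approach 1, sketched on the assumption that source_tmp is not needed once applicationID has been set: a third mutate filter can drop the temporary array so it is not indexed. The sketch also uses the bracketed field reference form %{[source_tmp][4]}, which is the documented syntax and, as far as I know, the safer choice on newer Logstash versions. Index 4 is correct because the leading "/" produces an empty first element when the path is split.

```
filter {
  if ([fields][log_type] == "yarnHive2kafkaLog") {
    # Copy first, in its own filter, so source itself stays a string.
    mutate {
      copy => { "source" => "source_tmp" }
    }
    # Split the copy and pick the application directory.
    mutate {
      split => { "source_tmp" => "/" }
      add_field => { "applicationID" => "%{[source_tmp][4]}" }
    }
    # Drop the temporary array (assumes nothing downstream uses it).
    mutate {
      remove_field => ["source_tmp"]
    }
  }
}
```

Keeping the cleanup in its own mutate filter makes the ordering explicit: the temporary field is removed only after applicationID has been populated.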