Hello,
I am currently building a pipeline to parse iostat data and send it to Elasticsearch. I am using grok to find the lines I am interested in. Some lines contain the timestamp; the other lines contain the device information for that timestamp.
#Sample Data
#
#Linux OSWbb v7.3.2
#zzz ***Thu Sep 22 08:00:07 BST 2016
#avg-cpu: %user %nice %system %iowait %steal %idle
# 11.86 0.00 3.67 14.67 0.00 69.81
#
#Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
#xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
#xvdb 0.00 11.00 0.00 121.00 0.00 528.00 8.73 0.23 1.90 0.10 1.20
#
#zzz ***Thu Sep 22 08:00:27 BST 2016
#avg-cpu: %user %nice %system %iowait %steal %idle
# 9.13 0.00 4.03 15.99 0.00 70.85
#
#Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
#xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
#xvdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
input {
  file {
    # Forward slashes avoid backslash-escaping issues in the Windows file input
    path => "C:/Workspace/test.dat"
    start_position => "beginning"
    ignore_older => 0
    sincedb_path => "NUL"
  }
}
filter {
  mutate {
    gsub => ["message", "\r", " "]
  }
  if [message] =~ "^Linux" or [message] =~ "^avg-cpu" or [message] =~ "^ " or [message] == "" or [message] =~ "^Device" {
    drop {}
  } else {
    if [message] =~ "^zzz" {
      grok {
        match => ["message", "%{DATA:field1} +%{DATA:field2} +%{MONTH:month} +%{NUMBER:day} +%{TIME:time} +%{DATA:field4} +%{YEAR:year}"]
      }
      mutate {
        add_field => {
          "timestamp" => "%{day} %{month} %{year} %{time}"
        }
      }
    } else {
      grok {
        match => ["message", "%{DATA:device} +%{NUMBER:read_request_merge_avg:float} +%{NUMBER:write_request_merge_avg:float} +%{NUMBER:read_iops_avg:float} +%{NUMBER:write_iops_avg:float} +%{NUMBER:MB_read_avg:float} +%{NUMBER:MB_write_avg:float} +%{NUMBER:avg_sector_size:float} +%{NUMBER:avg_queue_size:float} +%{NUMBER:io_wait_time_ms:float} +%{NUMBER:io_service_time_ms:float} +%{NUMBER:disk_util_perc:float}"]
      }
    }
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
I can match the lines without any problems, but what I really want to do is take the timestamp from a "zzz" line match and add it as a field to the subsequent device-line matches, so each event goes nicely into Elasticsearch, e.g.:
{
    "message" => "xvda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 ",
    "@version" => "1",
    "@timestamp" => "2016-10-27T23:11:32.507Z",
    "path" => "C:\\Users\\thomas.a.baker\\Downloads\\test.dat",
    "host" => "CPX-I9CC2XTM7B9",
    "device" => "xvda",
    "read_request_merge_avg" => 0.0,
    "write_request_merge_avg" => 0.0,
    "read_iops_avg" => 0.0,
    "write_iops_avg" => 0.0,
    "MB_read_avg" => 0.0,
    "MB_write_avg" => 0.0,
    "avg_sector_size" => 0.0,
    "avg_queue_size" => 0.0,
    "io_wait_time_ms" => 0.0,
    "io_service_time_ms" => 0.0,
    "disk_util_perc" => 0.0,
    "timestamp" => "22 Sep 2016 08:00:27"
}
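To make the carry-forward behaviour I'm after concrete, here it is sketched in plain Ruby (this is just an illustration, not Logstash code — `attach_timestamps` and the line patterns are my own invention): remember the timestamp from each "zzz" line and attach it to every following device line until the next timestamp appears.

```ruby
# Sketch only: carry a timestamp forward onto subsequent device lines.
def attach_timestamps(lines)
  current_ts = nil
  events = []
  lines.each do |line|
    if line.start_with?("zzz")
      # e.g. "zzz ***Thu Sep 22 08:00:27 BST 2016" -> "Thu Sep 22 08:00:27 BST 2016"
      current_ts = line.sub(/^zzz \*+/, "")
    elsif line =~ /^xvd/
      # a device stats line: keep the device name, tag on the remembered timestamp
      events << { "device" => line.split.first, "timestamp" => current_ts }
    end
  end
  events
end
```

The question is whether Logstash can hold state like `current_ts` across events the way this loop does across lines.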
I would then use the date filter to replace @timestamp, but the challenge for me is making that timestamp from one event available to the subsequent events. Is this even possible?
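For completeness, once each device event carries the timestamp field, I'd expect the date conversion to look roughly like this (the format string is my guess from the "22 Sep 2016 08:00:27" layout, and the timezone would need to be set explicitly since the field doesn't carry the BST offset):

```
date {
  match => ["timestamp", "dd MMM yyyy HH:mm:ss"]
  timezone => "Europe/London"
}
```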
Hope my question makes sense.
Thanks.