If you control how the log is produced and don't need to parse existing data, you can run vmstat with the -t
option. It appends a timestamp to every line, which is easy to parse.
$ vmstat -t -n 1 10
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu----- -----timestamp-----
r b swpd free buff cache si so bi bo in cs us sy id wa st EDT
2 0 5172 174696 32256 231448 0 0 42 112 29 35 0 0 99 0 0 2016-03-18 22:40:12
0 0 5172 174684 32256 231448 0 0 0 0 12 16 0 0 100 0 0 2016-03-18 22:40:13
0 0 5172 174684 32256 231448 0 0 0 0 7 8 0 0 100 0 0 2016-03-18 22:40:14
0 0 5172 174684 32256 231448 0 0 0 0 9 10 0 0 100 0 0 2016-03-18 22:40:15
0 0 5172 174684 32256 231448 0 0 0 0 8 10 0 0 100 0 0 2016-03-18 22:40:16
0 0 5172 174684 32256 231448 0 0 0 0 9 10 0 0 100 0 0 2016-03-18 22:40:17
0 0 5172 174684 32256 231448 0 0 0 0 7 8 0 0 100 0 0 2016-03-18 22:40:18
0 0 5172 174684 32256 231448 0 0 0 0 8 10 0 0 100 0 0 2016-03-18 22:40:19
0 0 5172 174684 32256 231448 0 0 0 0 8 10 0 0 100 0 0 2016-03-18 22:40:20
0 0 5172 174684 32256 231448 0 0 0 0 10 10 0 0 100 0 0 2016-03-18 22:40:21
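The filter in this answer only acts on events whose type is "vmstat", so whatever input reads these lines has to set that field. A minimal sketch, assuming the vmstat output is appended to a file (the path /var/log/vmstat.log is hypothetical; use whatever your setup writes to):

```
input {
  file {
    # Hypothetical path: the file that `vmstat -t -n 1` output is redirected to
    path => "/var/log/vmstat.log"
    # Must match the [type] == "vmstat" check in the filter
    type => "vmstat"
  }
}
```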
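Then you can do something like this. It drops the two header lines, splits each row into fields, and sets @timestamp from the date and time columns.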
filter {
  if [type] == "vmstat" {
    # Skip the two header lines that vmstat prints
    if [message] =~ /^\s*procs\s/ or [message] =~ /^\s*r\s+b\s+swpd/ {
      drop {}
    }
    # Real vmstat output pads its columns with runs of spaces; collapse them
    # and trim the ends so the csv filter sees exactly one separator per column
    mutate {
      gsub => ["message", " +", " "]
      strip => ["message"]
    }
    csv {
      separator => " "
      columns => ["[vmstat][r]", "[vmstat][b]", "[vmstat][swpd]", "[vmstat][free]", "[vmstat][buff]",
                  "[vmstat][cache]", "[vmstat][si]", "[vmstat][so]", "[vmstat][bi]", "[vmstat][bo]", "[vmstat][in]",
                  "[vmstat][cs]", "[vmstat][us]", "[vmstat][sy]", "[vmstat][id]", "[vmstat][wa]", "[vmstat][st]", "date", "time"]
    }
    mutate {
      convert => {
        "[vmstat][r]" => "integer"
        "[vmstat][b]" => "integer"
        "[vmstat][swpd]" => "integer"
        "[vmstat][free]" => "integer"
        "[vmstat][buff]" => "integer"
        "[vmstat][cache]" => "integer"
        "[vmstat][si]" => "integer"
        "[vmstat][so]" => "integer"
        "[vmstat][bi]" => "integer"
        "[vmstat][bo]" => "integer"
        "[vmstat][in]" => "integer"
        "[vmstat][cs]" => "integer"
        "[vmstat][us]" => "integer"
        "[vmstat][sy]" => "integer"
        "[vmstat][id]" => "integer"
        "[vmstat][wa]" => "integer"
        "[vmstat][st]" => "integer"
      }
      add_field => { "timestamp" => "%{date} %{time}" }
    }
    date {
      match => ["timestamp", "YYYY-MM-dd HH:mm:ss"]
      # remove time related fields once @timestamp has been set
      remove_field => [ "date", "time", "timestamp" ]
    }
  }
}
Keep in mind that the timezone is not set here. By default, Logstash uses the timezone / offset of the system it is running on, so timestamps will be off if vmstat runs on a host in a different zone.
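If that matters in your setup, the date filter's timezone option can pin the zone explicitly. A sketch, assuming the EDT timestamps in the sample output correspond to America/New_York (that zone is my assumption; substitute the zone of the host running vmstat):

```
date {
  match => ["timestamp", "YYYY-MM-dd HH:mm:ss"]
  # Assumed zone for the EDT sample output; replace with your own
  timezone => "America/New_York"
  # remove time related fields once @timestamp has been set
  remove_field => [ "date", "time", "timestamp" ]
}
```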