Poor Kibana behaviour with exec input from df command "/"?

Thorn · March 23, 2016, 11:16am

In testing I notice that Kibana seems to break a string field containing path delimiters into separate items. Is this deliberate? Should I change my approach to fix it?

I am parsing the output of a df command with the exec input, and am using the split {} and csv {} filters (I couldn't get grok to work, so I went with this).

input {
  exec {
    type => "dfcsv"
    command => "df -ah -B1 --output=target,source,used,avail,size <lots of sed here>"
    interval => 8
  }
}

filter {
  if [type] == "dfcsv" {
    split {}
    csv {
      columns => ["mounted_on", "filesystem", "df_used", "df_avail", "df_size"]
      separator => ","
      convert => {"df_used" => "integer" "df_avail" => "integer" "df_size" => "integer"}
    }
  }
}

Run from the shell, the df command piped through sed produces data like:

/,/dev/sda1,3031506944,2084298752,5414436864
/proc,proc,0,0,0
/sys,sysfs,0,0,0
/sys/fs/cgroup,none,0,4096,4096
/sys/fs/fuse/connections,none,0,0,0
/sys/kernel/debug,none,0,0,0
/sys/kernel/security,none,0,0,0
/dev,udev,4096,506908672,506912768
/dev/pts,devpts,0,0,0
/run,tmpfs,528384,103620608,104148992
/run/lock,none,0,5242880,5242880
/run/shm,none,4096,520724480,520728576
/run/user,none,0,104857600,104857600
/sys/fs/pstore,none,0,0,0
/sys/fs/cgroup/systemd,systemd,0,0,0
/media/sf_VboxFiles,VboxFiles,179655458816,53891772416,233547231232
/mnt/VboxFiles,VboxFiles,179655458816,53891772416,233547231232

This all seems to work: I get one event per line of df output, showing that Logstash is putting what it should into Elasticsearch, and the index is created if required.

"@timestamp" => "2016-03-23T10:28:47.331Z",
      "type" => "dfcsv",
      "host" => "aserver_somehere",
   "command" => "df -ah -B1 --output=target,source,used,avail,size <again, lots of sed here>",
"mounted_on" => "/mnt/VboxFiles",
"filesystem" => "VboxFiles",
   "df_used" => 179656765440,
  "df_avail" => 53890465792,
   "df_size" => 233547231232

"mounted_on" and "filesystem" are clearly strings.

But once in Kibana, these fields get split somehow, so mounted_on shows values like

vboxfiles
user
systemd
sys

instead of

 /sys/fs/pstore
 /sys/fs/cgroup/systemd
 /media/sf_VboxFiles
 /mnt/VboxFiles etc

Also, 'filesystem' has 'dev' and 'pts' as separate items where they should be together:

`/dev/pts`

The other values don't always display (but are searchable). I assume this is because the numeric data is improperly associated with the end values, although it's not consistent. I can search for substrings in the visualizations and they show up, but that is still not what I want, of course.

I'm 100% sure it's the "/" being escaped incorrectly and that I need to substitute or add something to stop the behavior. I can make it stop by replacing "/" with "." everywhere, but then the path data is, well, wrong. I know grok has the %{PATH} pattern, but that didn't cope with df's output of 'none' in the filesystem field, so I abandoned it.

I'm relatively new to ELK, so perhaps I'm doing something tragically obvious, but any assistance is much appreciated.

Thomas

magnusbaeck · March 23, 2016, 11:18am

The field with the path is being analyzed, i.e. broken up into tokens, by ES. You should consider making this field non-analyzed by changing the index template that Logstash applies.
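
A minimal sketch of that change, assuming Elasticsearch 2.x string mappings and the standard elasticsearch output plugin. The host, the template file path, the template name `dfcsv`, and the `logstash-*` index pattern are illustrative assumptions, not details from this thread; adjust them to your setup.

```
# Hypothetical template file, e.g. /etc/logstash/templates/dfcsv.json.
# Its contents are shown here as comments; the file itself is plain JSON.
#
#   {
#     "template": "logstash-*",
#     "mappings": {
#       "_default_": {
#         "properties": {
#           "mounted_on": { "type": "string", "index": "not_analyzed" },
#           "filesystem": { "type": "string", "index": "not_analyzed" }
#         }
#       }
#     }
#   }

output {
  elasticsearch {
    hosts => ["localhost:9200"]                        # assumed host
    template => "/etc/logstash/templates/dfcsv.json"   # assumed path
    template_name => "dfcsv"                           # assumed name
    template_overwrite => true                         # re-upload the template on startup
  }
}
```

A template only affects indices created after it is installed, so existing indices keep their analyzed mapping until they are reindexed or rolled over, and the field list of the index pattern in Kibana usually needs a refresh afterwards. If the index already uses the default Logstash template, string fields should also have a not_analyzed `.raw` sub-field (for example `mounted_on.raw`) that can be used in visualizations without any template change. On Elasticsearch 5.x and later, the equivalent mapping is the `keyword` type rather than a `string` with `"index": "not_analyzed"`.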

Thorn · March 23, 2016, 11:20am

Thanks magnus, will go into that further.