Elastic search indices, what variables are available?

sjlongland · August 19, 2016, 12:18am

Hi all,

I've hunted high and low through the documentation (written; not all of us have the luxury of watching videos) for information on this. In short, I have a prototype monitoring system using the elk-docker container (sebp/elk) running Elasticsearch/LogStash/Kibana, with data coming in from rsyslog and collectd.

So far so good, but on the Elasticsearch index, I'm a little lost. The Logstash output plug-in configuration is as follows:

output {
  elasticsearch {
    hosts => ["localhost"]
    sniffing => true
    manage_template => false
    index => "%{host}-%{+YYYY.MM.dd}"
    document_type => "%{type}"
  }
}

Now, that was found by experiment: the documentation says the following:

index
Value type is string

Default value is "logstash-%{+YYYY.MM.dd}"
The index to write events to. This can be dynamic using the %{foo} syntax.
The default value will partition your indices by day so you can more easily
delete old data or only search specific date ranges.
Indexes may not contain uppercase characters.
For weekly indexes ISO 8601 format is recommended, eg. logstash-%{+xxxx.ww}

I understand the concept that %{foo} can stand for a number of things. Through experiment I learned that %{host} and %{sysloghost} work. However, the latter only works for events arriving via rsyslog, and the former is sometimes an IP address. I haven't figured out why it's inconsistent regarding IP vs hostname, but I'd prefer hostnames if at all possible. A compromise might be to use %{sysloghost} if available, and fall back to %{host} otherwise. In Bourne shell syntax, that would be ${sysloghost:-${host}}. I have no idea whether LogStash/Elasticsearch support anything like this, as I'm yet to find a document that accurately describes the syntax and variables.

So, some queries:

Is there a document that formally defines the variables and syntax used for that formatting string?
Is there a reason why some logging messages are picking up the IP address sometimes and the hostname at other times?

sjlongland · August 19, 2016, 4:18am

Okay, a small update…

I found I was able to achieve what I was after by using the mutate filter:

filter {
  if [type] == "rsyslog" {
    mutate {
      replace => {
        "host" => "%{sysloghost}"
      }
    }
  }
}

That got me half-way there. The other half was to tell rsyslog to give me the FQDN:

# Use fully qualified name in forwarded logs
$PreserveFQDN on
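
For anyone after the "use sysloghost if available, otherwise keep host" behaviour asked about in the first post, a field-existence conditional is one way it might be written. This is only a sketch: it assumes sysloghost is set solely on events that came through rsyslog, and the second mutate is an optional extra, added because the quoted documentation says index names may not contain uppercase characters.

```
filter {
  # Assumption: "sysloghost" exists only on events that arrived via rsyslog.
  # A bare field reference in a conditional is true when the field is present.
  if [sysloghost] {
    mutate {
      replace => { "host" => "%{sysloghost}" }
    }
  }

  # Optional: "host" feeds the index name, and index names must be lowercase.
  mutate {
    lowercase => ["host"]
  }
}
```

Events without sysloghost pass through the conditional untouched, so their original host value is what ends up in the index name.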

magnusbaeck · August 19, 2016, 7:26am

Is there a document that formally defines the variables and syntax used for that formatting string?

There's no field documentation since, apart from @timestamp and tags, there aren't really any standardized fields.

Is there a reason why some logging messages are picking up the IP address sometimes and the hostname at other times?

That depends on the source data and the filters used. With the dns filter you can perform DNS lookups (forward and reverse) if you want to consistently store hostnames (or both hostnames and IP addresses).
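
As a rough illustration of the dns filter approach, something like the following could turn an IP address in host into a hostname. It is a sketch rather than a tested configuration: it assumes the address is stored in the host field and that the Logstash instance can reach a resolver with working reverse (PTR) records.

```
filter {
  dns {
    # Reverse-resolve the IP address held in "host" to a hostname.
    reverse => ["host"]
    # Overwrite the field in place instead of appending the lookup result.
    action => "replace"
  }
}
```

With action => "append", the resolved name would be added alongside the original value, which is one way to keep both the hostname and the IP address.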

sjlongland · August 22, 2016, 6:19am

Fair enough… it's just difficult navigating my way through the maze at the moment. I'm coming at this system for the first time; I can see the power in it, but it's bewildering.

For the input, it pretty much is coming from two sources:

rsyslog (using the syslog input plug-in)
collectd

Both of these are set up on the host to forward to ports bound on the loopback interface, which Docker has mapped to the ELK-stack container through to logstash. As such, the only machine that is able to connect is the host running the ELK-stack container. Everything else talks via the rsyslog or collectd instance on the host.

That's why I was a little confused when the host field on rsyslog's messages alternated between a hostname and IP address. In any case, the above works around the issue.

It might be helpful to include in the input plug-in documentation what the typical output fields are, and perhaps, wherever the "%{foo} syntax" is mentioned, to point readers to the input and filter plug-in documentation for hints on such fields.

magnusbaeck · August 22, 2016, 6:23am

It might be helpful to include in the input plug-in documentation what the typical output fields are, ...

I agree; the input plugins rarely specify which fields they emit, which forces you to find out for yourself during testing. This is clearly an area for improvement.
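
One common way to do that testing is to dump complete events to the console and read off the field names. A minimal sketch, assuming you can temporarily add a second output next to the elasticsearch one and watch Logstash's standard output (for example via the container's logs):

```
output {
  # Print every event with all of its fields; remove once you know the field names.
  stdout {
    codec => rubydebug
  }
}
```

Any field that appears in that dump can then be referenced as %{fieldname} in the index setting or in filters.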
