Logstash Variables

I am doing my level best to understand where to define an environment variable. Please forgive me, as I am not very strong in Unix as a whole and am learning as I go.

But this page really needs some help: https://www.elastic.co/guide/en/logstash/current/environment-variables.html

More often than not, I cannot find where I need to define or declare a configuration option, and it is very, very painful.

Let’s set the value of HOME:

export HOME="/path"

Can someone please explain exactly where I need to define this? I am running Ubuntu 16.04 LTS and I have tried the following:


No matter what I do, I cannot get logstash to see this declaration.
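
One possible reason, sketched under the assumption that Logstash runs as a service and not from the shell where the export was typed: `export` only affects the current shell and the processes that shell starts. In the sketch below, `DEMO_VAR` is a made-up name, and `env -i` merely stands in for a process that does not inherit the shell's environment; it is not how the Logstash service is actually launched.

```shell
#!/bin/sh
# DEMO_VAR is a hypothetical variable used only for this illustration.
export DEMO_VAR="/path"

# A child started from this shell inherits the exported variable.
child=$(/bin/sh -c 'printf "%s" "${DEMO_VAR:-unset}"')

# A process started with an empty environment (env -i) does not see it.
# This stands in for a daemon that was not started from this shell.
service_like=$(env -i /bin/sh -c 'printf "%s" "${DEMO_VAR:-unset}"')

echo "child of this shell: $child"
echo "clean environment:   $service_like"
```

If the second line prints `unset`, the variable has to be defined wherever the service itself picks up its environment, not in an interactive shell.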

Show us your config file and run 'ps -ef www | grep logstash' to show whether --allow-env is part of the Logstash command line.

How are you starting Logstash?

I will post a copy of my config shortly.

Magnus, it is started however the installer set it up.

Here you go:

logstash  2038     1 99 15:15 ?        SNsl   0:52 /usr/bin/java -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+DisableExplicitGC -Djava.awt.headless=true -Dfile.encoding=UTF-8 -XX:+HeapDumpOnOutOfMemoryError -Xmx3g -Xms3g -Xss2048k -Djffi.boot.library.path=/usr/share/logstash/vendor/jruby/lib/jni -Xbootclasspath/a:/usr/share/logstash/vendor/jruby/lib/jruby.jar -classpath : -Djruby.home=/usr/share/logstash/vendor/jruby -Djruby.lib=/usr/share/logstash/vendor/jruby/lib -Djruby.script=jruby -Djruby.shell=/bin/sh org.jruby.Main --1.9 /usr/share/logstash/lib/bootstrap/environment.rb logstash/runner.rb --path.settings /etc/logstash

Okay, but are you using systemctl start logstash or similar? And which version of Logstash is this?

Yes sir.

Logstash version 5.1.2.

What does your configuration look like, i.e. how are you referencing the environment variable? Does the Logstash process actually have the environment variable set? Check in /proc/PID/environ, where PID is the PID of the Logstash process.
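
For what it's worth, /proc/PID/environ is NUL-separated, so it is easier to read after splitting it into lines. A minimal sketch: `split_environ` and `NODE_NAME` are names made up for illustration, the sample data in the demo is invented, and 2038 is the PID from the ps output earlier in this thread.

```shell
#!/bin/sh
# Split NUL-separated environ data read from stdin into one VAR=value per line.
# split_environ is a hypothetical helper name, not part of any tool.
split_environ() {
    tr '\0' '\n'
}

# Self-contained demo with invented data in the same format as /proc/PID/environ.
printf 'PATH=/usr/bin\000NODE_NAME=demo-node\000' | split_environ | grep '^NODE_NAME='
# prints: NODE_NAME=demo-node

# Against the real process (reading another user's environ usually needs root):
#   sudo cat /proc/2038/environ | split_environ | grep '^NODE_NAME='
```

If grep prints nothing for the real process, the variable never reached Logstash, whatever the pipeline config says.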

My logstash boxes are 4 CPU, 8 gig of RAM, 50 gig of HDD. They do not handle local storage unless I am debugging something. Otherwise, they process syslogs, winlogbeat, etc, and pipe to both data center clusters.

Here is my logstash.yml (same setup for all Logstash boxes):

> # Settings file in YAML
> #
> # Settings can be specified either in hierarchical form, e.g.:
> #
> #   pipeline:
> #     batch:
> #       size: 125
> #       delay: 5
> #
> # Or as flat keys:
> #
> #   pipeline.batch.size: 125
> #   pipeline.batch.delay: 5
> #
> # ------------  Node identity ------------
> #
> # Use a descriptive name for the node:
> #
> node.name: <Server Name>
> #
> # If omitted the node name will default to the machine's host name
> #
> # ------------ Data path ------------------
> #
> # Which directory should be used by logstash and its plugins
> # for any persistent needs. Defaults to LOGSTASH_HOME/data
> #
> path.data: /var/lib/logstash
> #
> # ------------ Pipeline Settings --------------
> #
> # Set the number of workers that will, in parallel, execute the filters+outputs
> # stage of the pipeline.
> #
> # This defaults to the number of the host's CPU cores.
> #
> pipeline.workers: 16
> #
> # How many workers should be used per output plugin instance
> #
> pipeline.output.workers: 8
> #
> # How many events to retrieve from inputs before sending to filters+workers
> #
> # pipeline.batch.size: 125
> #
> # How long to wait before dispatching an undersized batch to filters+workers
> # Value is in milliseconds.
> #
> # pipeline.batch.delay: 5
> #
> # Force Logstash to exit during shutdown even if there are still inflight
> # events in memory. By default, logstash will refuse to quit until all
> # received events have been pushed to the outputs.
> #
> # WARNING: enabling this can lead to data loss during shutdown
> #
> # pipeline.unsafe_shutdown: false
> #
> # ------------ Pipeline Configuration Settings --------------
> #
> # Where to fetch the pipeline configuration for the main pipeline
> #
> path.config: /etc/logstash/conf.d
> #
> # Pipeline configuration string for the main pipeline
> #
> # config.string:
> #
> # At startup, test if the configuration is valid and exit (dry run)
> #
> # config.test_and_exit: false
> #
> # Periodically check if the configuration has changed and reload the pipeline
> # This can also be triggered manually through the SIGHUP signal
> #
> config.reload.automatic: true
> #
> # How often to check if the pipeline configuration has changed (in seconds)
> #
> # config.reload.interval: 3
> #
> # Show fully compiled configuration as debug log message
> # NOTE: --log.level must be 'debug'
> #
> # config.debug: false
> #
> # ------------ Queuing Settings --------------
> #
> # Internal queuing model, "memory" for legacy in-memory based queuing and
> # "persisted" for disk-based acked queueing. Defaults is memory
> #
> # queue.type: memory
> #
> # If using queue.type: persisted, the directory path where the data files will be stored.
> # Default is path.data/queue
> #
> # path.queue:
> #
> # If using queue.type: persisted, the page data files size. The queue data consists of
> # append-only data files separated into pages. Default is 250mb
> #
> # queue.page_capacity: 250mb
> #
> # If using queue.type: persisted, the maximum number of unread events in the queue.
> # Default is 0 (unlimited)
> #
> # queue.max_events: 0
> #
> # If using queue.type: persisted, the total capacity of the queue in number of bytes.
> # If you would like more unacked events to be buffered in Logstash, you can increase the 
> # capacity using this setting. Please make sure your disk drive has capacity greater than 
> # the size specified here. If both max_bytes and max_events are specified, Logstash will pick 
> # whichever criteria is reached first
> # Default is 1024mb or 1gb
> #
> # queue.max_bytes: 1024mb
> #
> # If using queue.type: persisted, the maximum number of acked events before forcing a checkpoint
> # Default is 1024, 0 for unlimited
> #
> # queue.checkpoint.acks: 1024
> #
> # If using queue.type: persisted, the maximum number of written events before forcing a checkpoint
> # Default is 1024, 0 for unlimited
> #
> # queue.checkpoint.writes: 1024
> #
> # If using queue.type: persisted, the interval in milliseconds when a checkpoint is forced on the head page
> # Default is 1000, 0 for no periodic checkpoint.
> #
> # queue.checkpoint.interval: 1000
> #
> # ------------ Metrics Settings --------------
> #
> # Bind address for the metrics REST endpoint
> #
> # http.host: ""
> #
> # Bind port for the metrics REST endpoint, this option also accept a range
> # (9600-9700) and logstash will pick up the first available ports.
> #
> # http.port: 9600-9700
> #
> # ------------ Debugging Settings --------------
> #
> # Options for log.level:
> #   * fatal
> #   * error
> #   * warn
> #   * info (default)
> #   * debug
> #   * trace
> #
> # log.level: info
> path.logs: /var/log/logstash
> #
> # ------------ Other Settings --------------
> #
> # Where to find custom plugins
> # path.plugins: []

Format the config file snippet as preformatted text (use the toolbar button).


Um, sorry. Not settings.yml but the files in /etc/logstash/conf.d.

And what about /proc/PID/environ, is the variable set at all?

I have a config file for each input pipeline I've established:

# LOGNTWK Input Parameters
input {
  udp {
    port => 31514
    buffer_size => 131072
    workers => 8
    add_field => [ "pipeline", "LOGNTWK" ]
    add_field => [ "pipeline_protocol", "udp" ]
    add_field => [ "pipeline_port", "31514" ]
  }
}

Then I have a config file for tagging device types:

filter {
  if [pipeline] == "LOGNTWK" {
    if [message] =~ /(?i)(-FA)/ {
      mutate { add_field => [ "device_type", "Floor Access Switch" ] }
    } else if [message] =~ /(?i)-RT0/ {
      mutate { add_field => [ "device_type", "WAN Router" ] }
    } else if [message] =~ /(?i)-DR0/ {
      mutate { add_field => [ "device_type", "DMVPN Router" ] }
    } else if [message] =~ /(?i)-CS0/ {
      mutate { add_field => [ "device_type", "Core Switch" ] }
    } else if [message] =~ /(?i)-DS0/ {
      mutate { add_field => [ "device_type", "Distribution Switch" ] }
    } else if [message] =~ /(?i)-SA/ {
      mutate { add_field => [ "device_type", "Server Access Switch" ] }
    } else if [message] =~ /(?i)-IR0/ {
      mutate { add_field => [ "device_type", "Internet Router" ] }
    } else if [message] =~ /(?i)-VA0/ {
      mutate { add_field => [ "device_type", "Video Access Switch" ] }
    } else if [message] =~ /(?i)-VR0/ {
      mutate { add_field => [ "device_type", "Voice Router" ] }
    } else if [message] =~ /(?i)-VG0/ {
      mutate { add_field => [ "device_type", "Voice Gateway" ] }
    } else if [host] =~ /(?i)10\.([0-9]{1,3})\.19\.(10|11|12)/ {
      mutate { add_field => [ "device_type", "Cisco Fabric Interconnect" ] }
    } else {
      mutate { add_field => [ "device_type", "Unknown" ] }
    }
  }
}

Then I have a config file for GROK'ing

# LOGNTWK Grok Parameters
filter {
  if [pipeline] == "LOGNTWK" {
    if [device_type] != "Cisco Fabric Interconnect" {
      grok {
        break_on_match => true
        match => [
          "message", "<%{INT}>%{DATA:syslog_timestamp} %{IPV4:device_ip} <%{INT:syslog_pri}>%{INT:cisco_sequence_number}: %{DATA:hostname}: %{CISCOTIMESTAMP:@timestamp} %{TZ:timezone}: %%{WORD:cisco_facility}-%{WORD:cisco_severity}-%{WORD:cisco_mnemonic}: %{GREEDYDATA:syslog_body}"
        ]
      }
      if "Login Success" in [syslog_body] {
        grok {
          break_on_match => true
          match => [
            "syslog_body", "Login Success \[user: %{DATA:user_id}\] \[Source: %{IPV4:client_ip}\] \[localport: %{INT:port}\]",
            "syslog_body", "Login Success \[user: %{DATA:user_id}\] \[Source: %{DATA:client_ip}\] \[localport: %{INT:port}\]"
          ]
        }
      }
      if [cisco_facility] == "AUTHMGR" and [cisco_mnemonic] == "START" {
        grok {
          match => { "syslog_body" => "Starting '%{DATA:dot1x_authentication_type}' for client \(%{MAC:client_mac}\) on Interface %{DATA:device_interface} AuditSessionID %{GREEDYDATA:dot1x_session_id}" }
        }
      }
      if [cisco_facility] == "EPM" and [cisco_mnemonic] == "POLICY_REQ" {
        grok {
          match => { "syslog_body" => "IP %{IPV4:client_ip}\| MAC %{MAC:client_mac}\| AuditSessionID %{DATA:dot1x_session_id}\| AUTHTYPE %{GREEDYDATA:dot1x_authentication_type}\|" }
        }
      }
      if [cisco_facility] == "SEC" and [cisco_mnemonic] == "IPACCESSLOGS" {
        grok {
          match => { "syslog_body" => "list %{NOTSPACE:access_control_list_name} %{NOTSPACE:access_control_list_action} %{IPV4:source_ip} %{INT:source_attempts} packets" }
        }
      }
      if [cisco_facility] == "RADIUS" and [cisco_mnemonic] == "NOSERVERS" {
        grok {
          match => { "syslog_body" => "No Radius hosts configured or no valid server present in the server group +%{GREEDYDATA:radius_group}" }
        }
      }
      if ([cisco_facility] =~ /(AUTHMGR|DOT1X)/) and [cisco_mnemonic] == "FAIL" {
        grok {
          break_on_match => true
          match => [
            "syslog_body", "%{NOTSPACE:aaa_type} %{NOTSPACE:aaa_status} for client \(%{MAC:client_mac}\) on Interface %{DATA:device_interface} AuditSessionID %{GREEDYDATA:dot1x_session_id}",
            "syslog_body", "%{NOTSPACE:aaa_type} %{NOTSPACE:aaa_status} for client \(%{DATA:client_mac}\) on Interface %{DATA:device_interface} AuditSessionID %{GREEDYDATA:dot1x_session_id}"
          ]
        }
      }
      if [cisco_facility] == "ISDN" and [cisco_mnemonic] == "CONNECT" {
        grok {
          match => { "syslog_body" => "Interface %{DATA:device_interface} is now connected to %{NOTSPACE:destination_phone_number} %{GREEDYDATA:destination_name}" }
        }
      }
      if [cisco_facility] == "ISDN" and [cisco_mnemonic] == "DISCONNECT" {
        grok {
          match => { "syslog_body" => "Interface %{DATA:device_interface} +disconnected from %{NOTSPACE:destination_phone_number} , call lasted %{INT:call_duration} seconds" }
        }
      }
      if [cisco_facility] == "ILPOWER" and ([cisco_mnemonic] =~ /(POWER_GRANTED|IEEE_DISCONNECT)/) {
        grok {
          break_on_match => true
          match => [
            "syslog_body", "Interface %{DATA:device_interface}: Power %{GREEDYDATA:power_status}",
            "syslog_body", "Interface %{DATA:device_interface}: %{DATA:disconnect_message} \(%{DATA:device_hostname}\)",
            "syslog_body", "Interface %{DATA:device_interface}: %{GREEDYDATA:device_message}"
          ]
        }
      }
      if [cisco_facility] == "ILPOWER" and [cisco_mnemonic] == "CONTROLLER_PORT_ERR" {
        grok {
          match => { "syslog_body" => "Controller port error, Interface %{DATA:device_interface}: Power Controller reports %{GREEDYDATA:device_message}" }
        }
      }
      if [cisco_facility] == "SYS" and [cisco_mnemonic] == "CONFIG_I" {
        grok {
          break_on_match => true
          match => [
            "syslog_body", "Configured from console by %{DATA:user_id} on %{DATA:device_interface} \(%{IPV4:source_ip}\)",
            "syslog_body", "Configured from console by %{DATA:user_id} on %{DATA:device_interface} \(%{DATA:script_name}\)",
            "syslog_body", "Configured from %{IP:source_ip} by %{DATA:application}"
          ]
        }
      }
      if "ROUTE_TABLE_CHANGE" in [syslog_body] {
        grok {
          match => { "syslog_body" => "%{DATA:eem_script_name}: 0\(%{DATA:route_table_action}\): %{IPV4:network_route}/0 \| Interface: %{DATA:device_interface}\(%{DATA:device_interface_name}\)" }
        }
      }
    } else if [device_type] == "Cisco Fabric Interconnect" {
      grok {
        break_on_match => true
        match => [
          "message", "<%{INT}>%{SYSLOGTIMESTAMP:@timestamp} %{IPV4:device_ip} <%{INT:syslog_pri}>: %{DATA} %{TZ:timezone}: %%{WORD:cisco_facility}-%{WORD:cisco_severity}-%{WORD:cisco_mnemonic}: +%{GREEDYDATA:syslog_body}",
          "message", "<%{INT}>%{SYSLOGTIMESTAMP:@timestamp} %{IPV4:device_ip} <%{INT:syslog_pri}>: %{DATA} %{TZ:timezone}: %{DATA} %%{WORD:cisco_facility}-%{WORD:cisco_severity}-%{WORD:cisco_mnemonic}: +%{GREEDYDATA:syslog_body}",
          "message", "<%{INT}>%{SYSLOGTIMESTAMP:@timestamp} %{IPV4:device_ip} <%{INT:syslog_pri}>: %{DATA} %{TZ:timezone}: +%{GREEDYDATA:syslog_body}"
        ]
      }
    }
  }
}

I have a couple config files for mutations:

# Mutate Parameters - Syslog Priority
filter {
  syslog_pri { }
}

# Mutate Parameters - Uppercase Characters
filter {
  if [aaa_accounting_flag] =~ /.+/ {
    mutate { uppercase => [ "aaa_accounting_flag" ] }
  }
  if [aaa_protocol] =~ /.+/ {
    mutate { uppercase => [ "aaa_protocol" ] }
  }
  if [access_control_list_action] =~ /.+/ {
    mutate { uppercase => [ "access_control_list_action" ] }
  }
  if [access_control_rule_action] =~ /.+/ {
    mutate { uppercase => [ "access_control_rule_action" ] }
  }
  if [acs_protocol] =~ /.+/ {
    mutate { uppercase => [ "acs_protocol" ] }
  }
  if [application_protocol] =~ /.+/ {
    mutate { uppercase => [ "application_protocol" ] }
  }
  if [device_hostname] =~ /.+/ {
    mutate { uppercase => [ "device_hostname" ] }
  }
  if [dns_record_status] =~ /.+/ {
    mutate { uppercase => [ "dns_record_status" ] }
  }
  if [dot1x_authentication_type] =~ /.+/ {
    mutate { uppercase => [ "dot1x_authentication_type" ] }
  }
  if [esx_log_level] =~ /.+/ {
    mutate { uppercase => [ "esx_log_level" ] }
  }
  if [esx_vm_name] =~ /.+/ {
    mutate { uppercase => [ "esx_vm_name" ] }
  }
  if [esx_vmx_name] =~ /.+/ {
    mutate { uppercase => [ "esx_vmx_name" ] }
  }
  if [event_user] =~ /.+/ {
    mutate { uppercase => [ "event_user" ] }
  }
  if [facility_label] =~ /.+/ {
    mutate { uppercase => [ "facility_label" ] }
  }
  if [host] =~ /.+/ {
    mutate { uppercase => [ "host" ] }
  }
  if [hostname] =~ /.+/ {
    mutate { uppercase => [ "hostname" ] }
  }
  if [level] =~ /.+/ {
    mutate { uppercase => [ "level" ] }
  }
  if [log_name] =~ /.+/ {
    mutate { uppercase => [ "log_name" ] }
  }
  if [logon_id] =~ /.+/ {
    mutate { uppercase => [ "logon_id" ] }
  }
  if [nas_device_hostname] =~ /.+/ {
    mutate { uppercase => [ "nas_device_hostname" ] }
  }
  if [nas_protocol] =~ /.+/ {
    mutate { uppercase => [ "nas_protocol" ] }
  }
  if [prime_service] =~ /.+/ {
    mutate { uppercase => [ "prime_service" ] }
  }
  if [process_name] =~ /.+/ {
    mutate { uppercase => [ "process_name" ] }
  }
  if [protocol] =~ /.+/ {
    mutate { uppercase => [ "protocol" ] }
  }
  if [severity_label] =~ /.+/ {
    mutate { uppercase => [ "severity_label" ] }
  }
  if [sip_dns_name] =~ /.+/ {
    mutate { uppercase => [ "sip_dns_name" ] }
  }
  if [sfr_connection_type] =~ /.+/ {
    mutate { uppercase => [ "sfr_connection_type" ] }
  }
  if [syslog_facility] =~ /.+/ {
    mutate { uppercase => [ "syslog_facility" ] }
  }
  if [syslog_severity] =~ /.+/ {
    mutate { uppercase => [ "syslog_severity" ] }
  }
  if [user_identity_restricted] =~ /.+/ {
    mutate { uppercase => [ "user_identity_restricted" ] }
  }
  if [user_id] =~ /.+/ {
    mutate { uppercase => [ "user_id" ] }
  }
  if [vcs_action] =~ /.+/ {
    mutate { uppercase => [ "vcs_action" ] }
  }
  if [vcs_destination_host] =~ /.+/ {
    mutate { uppercase => [ "vcs_destination_host" ] }
  }
  if [vcs_method] =~ /.+/ {
    mutate { uppercase => [ "vcs_method" ] }
  }
  if [vcs_service] =~ /.+/ {
    mutate { uppercase => [ "vcs_service" ] }
  }
  if [vpn_user] =~ /.+/ {
    mutate { uppercase => [ "vpn_user" ] }
  }
}

# Mutate Parameters - Remove Tag
filter {
  if [pipeline] == "LOGBEATS" {
    mutate { remove_tag => [ "beats_input_codec_plain_applied" ] }
  }
}

Lastly, I have a config file that establishes the log's origin:

# Mutate Parameters - City Name
filter {
  if [host] =~ /10\.1\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "Memphis" ] }
  } else if [host] =~ /10\.2\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "Jackson" ] }
  } else if [host] =~ /10\.3\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "Nashville" ] }
  } else if [host] =~ /10\.4\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "Chattanooga" ] }
  } else if [host] =~ /10\.5\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "Knoxville" ] }
  } else if [host] =~ /10\.7\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "Johnson City" ] }
  } else if [host] =~ /10\.8\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "Washington" ] }
  } else if [host] =~ /10\.10\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "Birmingham" ] }
  } else if [host] =~ /10\.11\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "New Orleans" ] }
  } else if [host] =~ /10\.12\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "Mandeville" ] }
  } else if [host] =~ /10\.13\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "Baton Rouge" ] }
  } else if [host] =~ /10\.14\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "East Memphis" ] }
  } else if [host] =~ /10\.15\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "Montgomery" ] }
  } else if [host] =~ /10\.16\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "Macon" ] }
  } else if [host] =~ /10\.17\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "Houston" ] }
  } else if [host] =~ /10\.18\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "Orlando" ] }
  } else if [host] =~ /10\.19\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "Fort Lauderdale" ] }
  } else if [host] =~ /10\.20\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "Memphis Lab" ] }
  } else if [host] =~ /10\.21\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "Knoxville Lab" ] }
  } else if [host] =~ /10\.22\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "Tallahassee" ] }
  } else if [host] =~ /10\.23\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "Jacksonville" ] }
  } else if [host] =~ /10\.24\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "Columbia" ] }
  } else if [host] =~ /10\.25\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "Atlanta" ] }
  } else if [host] =~ /10\.26\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "Baltimore" ] }
  } else if [host] =~ /10\.27\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "City Center" ] }
  } else if [host] =~ /10\.28\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "Towson" ] }
  } else if [host] =~ /10\.92\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "Home Router" ] }
  } else if [host] =~ /10\.101\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "Atlanta" ] }
  } else if [host] =~ /10\.102\.([0-9]{1,3})/ {
    mutate { add_field => [ "city", "Scottsdale" ] }
  } else {
    mutate { add_field => [ "city", "Unknown" ] }
  }
}

Finally, the outputs:
Localhost is only enabled if I am doing some local debugging of some sort.

# ELASTICSEARCH Output Parameters (Localhost)
#output {
# elasticsearch {
#   hosts => ["http://localhost:9200"]
#   index => "logstash-%{+YYYY.MM.dd}"
# }
#  stdout { codec => rubydebug }
#}

First data center cluster target:

# ELASTICSEARCH Output Parameters
output {
  elasticsearch {
    hosts => ["http://xxx.xxx.xxx.xxx:9200","http://xxx.xxx.xxx.xxx:9200",""]
    index => "logstash-%{+YYYY.MM.dd}"
  }
#  stdout { codec => rubydebug }
}

Second data center target:

# ELASTICSEARCH Output Parameters
output {
  elasticsearch {
    hosts => ["http://xxx.xxx.xxx.xxx:9200","http://xxx.xxx.xxx.xxx:9200",""]
    index => "logstash-%{+YYYY.MM.dd}"
  }
#  stdout { codec => rubydebug }
}

Okay, but where's the environment variable reference?

I removed it, it was killing Logstash.

It was this:

# Mutate Parameters - Pipeline Ingress
filter {
  mutate {
    add_field => { "pipeline_ingress" => "$node.name" }
  }
}

And my environment file is this:

SERVER:/etc/logstash/conf.d$ sudo cat /etc/environment


I am on Ubuntu 16.04 LTS.