Logstash node is missing from Monitoring in Kibana

Hi All,

We have two Logstash nodes running in our DEV cluster, but only one is visible in Kibana under Monitoring.

I compared the logstash.yml files of both nodes side by side and could not find any differences between them except for the node name.

Below is the logstash.yml for your reference

ELK stack version: 5.4.3

# Settings file in YAML
#
# Settings can be specified either in hierarchical form, e.g.:
#
###   pipeline:
###     batch:
###       size: 125
###       delay: 5
#
# Or as flat keys:
#
#   pipeline.batch.size: 125
#   pipeline.batch.delay: 5
#
# ------------  Node identity ------------
#
# Use a descriptive name for the node:
#
#
node.name: ebb-1logstash-dev
# If omitted the node name will default to the machine's host name
#
# ------------ Data path ------------------
#
# Which directory should be used by logstash and its plugins
# for any persistent needs. Defaults to LOGSTASH_HOME/data
#
path.data: /opt/elk/data/elk/logstash-5.4.3/data
#
# ------------ Pipeline Settings --------------
#
# Set the number of workers that will, in parallel, execute the filters+outputs
# stage of the pipeline.
#
# This defaults to the number of the host's CPU cores.
#
# pipeline.workers: 2
#
# How many workers should be used per output plugin instance
#
# pipeline.output.workers: 1
#
# How many events to retrieve from inputs before sending to filters+workers
#
# pipeline.batch.size: 125
#
# How long to wait before dispatching an undersized batch to filters+workers
# Value is in milliseconds.
#
# pipeline.batch.delay: 5
#
# Force Logstash to exit during shutdown even if there are still inflight
# events in memory. By default, logstash will refuse to quit until all
# received events have been pushed to the outputs.
#
# WARNING: enabling this can lead to data loss during shutdown
#
# pipeline.unsafe_shutdown: false
#
# ------------ Pipeline Configuration Settings --------------
#
# Where to fetch the pipeline configuration for the main pipeline
#
path.config: /opt/elk/data/elk/logstash-5.4.3/config
#
# Pipeline configuration string for the main pipeline
#
# config.string:
#
# At startup, test if the configuration is valid and exit (dry run)
#
# config.test_and_exit: false
#
# Periodically check if the configuration has changed and reload the pipeline
# This can also be triggered manually through the SIGHUP signal
#
# config.reload.automatic: false
#
# How often to check if the pipeline configuration has changed (in seconds)
#
# config.reload.interval: 3
#
# Show fully compiled configuration as debug log message
# NOTE: --log.level must be 'debug'
#
# config.debug: false
#
# When enabled, process escaped characters such as \n and \" in strings in the
# pipeline configuration files.
#
# config.support_escapes: false
#
# ------------ Module Settings ---------------
# Define modules here.  Modules definitions must be defined as an array.
# The simple way to see this is to prepend each `name` with a `-`, and keep
# all associated variables under the `name` they are associated with, and 
# above the next, like this:
#
# modules:
#   - name: MODULE_NAME
#     var.PLUGINTYPE1.PLUGINNAME1.KEY1: VALUE
#     var.PLUGINTYPE1.PLUGINNAME1.KEY2: VALUE
#     var.PLUGINTYPE2.PLUGINNAME1.KEY1: VALUE
#     var.PLUGINTYPE3.PLUGINNAME3.KEY1: VALUE
#
# Module variable names must be in the format of 
#
# var.PLUGIN_TYPE.PLUGIN_NAME.KEY
#
# modules:
#
# ------------ Queuing Settings --------------
#
# Internal queuing model, "memory" for legacy in-memory based queuing and
# "persisted" for disk-based acked queueing. Defaults is memory
#
# queue.type: memory
#
# If using queue.type: persisted, the directory path where the data files will be stored.
# Default is path.data/queue
#
# path.queue:
#
# If using queue.type: persisted, the page data files size. The queue data consists of
# append-only data files separated into pages. Default is 250mb
#
# queue.page_capacity: 250mb
#
# If using queue.type: persisted, the maximum number of unread events in the queue.
# Default is 0 (unlimited)
#
# queue.max_events: 0
#
# If using queue.type: persisted, the total capacity of the queue in number of bytes.
# If you would like more unacked events to be buffered in Logstash, you can increase the
# capacity using this setting. Please make sure your disk drive has capacity greater than
# the size specified here. If both max_bytes and max_events are specified, Logstash will pick
# whichever criteria is reached first
# Default is 1024mb or 1gb
#
# queue.max_bytes: 1024mb
#
# If using queue.type: persisted, the maximum number of acked events before forcing a checkpoint
# Default is 1024, 0 for unlimited
#
# queue.checkpoint.acks: 1024
#
# If using queue.type: persisted, the maximum number of written events before forcing a checkpoint
# Default is 1024, 0 for unlimited
#
# queue.checkpoint.writes: 1024
#
# If using queue.type: persisted, the interval in milliseconds when a checkpoint is forced on the head page
# Default is 1000, 0 for no periodic checkpoint.
#
# queue.checkpoint.interval: 1000
#
# ------------ Dead-Letter Queue Settings --------------
# Flag to turn on dead-letter queue.
#
# dead_letter_queue.enable: false

# If using dead_letter_queue.enable: true, the maximum size of each dead letter queue. Entries
# will be dropped if they would increase the size of the dead letter queue beyond this setting.
# Default is 1024mb
# dead_letter_queue.max_bytes: 1024mb

# If using dead_letter_queue.enable: true, the directory path where the data files will be stored.
# Default is path.data/dead_letter_queue
#
# path.dead_letter_queue:
#
# ------------ Metrics Settings --------------
#
# Bind address for the metrics REST endpoint
#
# http.host: "172.29.140.26"
#
# Bind port for the metrics REST endpoint, this option also accept a range
# (9600-9700) and logstash will pick up the first available ports.
#
# http.port: 9600-9700
#
# ------------ Debugging Settings --------------
#
# Options for log.level:
#   * fatal
#   * error
#   * warn
#   * info (default)
#   * debug
#   * trace
#
# log.level: info
path.logs: /opt/elk/data/elk/logstash-5.4.3/logs
#
# ------------ Other Settings --------------
#
# Where to find custom plugins
# path.plugins: []
xpack.monitoring.enabled: "true"
xpack.monitoring.elasticsearch.url: ["http://102.15.140.51:9200","http://102.15.140.52:9200","http://102.15.140.53:9200"]
xpack.monitoring.elasticsearch.username: "logstash_system"
xpack.monitoring.elasticsearch.password: REMOVED

How did you setup the Logstash nodes? Did you happen to copy/paste or reuse the same container image? If you hit the http endpoint (default is http://localhost:9600/) for each logstash node, do you see unique ids? Can you paste the output for both?
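The check can be scripted. The following is a minimal sketch, assuming the root of the Logstash HTTP API returns a JSON document with `id` and `name` fields. The function names are illustrative, not part of Logstash. Note that `http.host` is commented out in the posted logstash.yml, so the API most likely listens only on 127.0.0.1 and would have to be queried from each node locally.

```python
import json
from collections import defaultdict
from urllib.request import urlopen


def fetch_node_info(base_url, timeout=5):
    """Return the JSON document served at the root of the Logstash HTTP API."""
    with urlopen(base_url, timeout=timeout) as response:
        return json.load(response)


def find_shared_ids(node_infos):
    """Map each id reported by more than one node to the names of those nodes."""
    names_by_id = defaultdict(list)
    for info in node_infos:
        # "id" and "name" are assumed field names in the API response.
        names_by_id[info["id"]].append(info.get("name", "<unnamed>"))
    return {
        node_id: names
        for node_id, names in names_by_id.items()
        if len(names) > 1
    }
```

Run `fetch_node_info("http://localhost:9600/")` on each node, collect the results into a list, and pass that list to `find_shared_ids`. An empty result means every node reports its own id.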

It is not a container image; it is a legacy setup running on a bare-metal server.

Logstash is not run as a service. We start it with a script that points it to console.log and error.log for logging, and we use a 'conf' file for the input, filter, and output setup, which works fine.

I can see logs in console.log on both nodes.

I also could not reach the endpoint at http://FQDN:9600, and I tried https://FQDN:9600 as well; there is no output from either node.

Since this is version 5.4.3, can a bug be ruled out? PROD mirrors the same setup exactly, and both nodes are listed fine in Kibana there.

What do you get when you run this query against the cluster containing the monitoring data?

POST .monitoring-logstash-*/_search
{
  "size": 0,
  "aggs": {
    "ids": {
      "terms": {
        "field": "logstash_stats.logstash.uuid",
        "size": 10
      }
    }
  }
}

I get the below result in DEV:

{
  "took": 242,
  "timed_out": false,
  "_shards": {
    "total": 6,
    "successful": 6,
    "failed": 0
  },
  "hits": {
    "total": 89765,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "ids": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "192b14df-fdfb-40d9-be45-4994e460c011",
          "doc_count": 89765
        }
      ]
    }
  }
}

In PROD I get the below:

{
  "took": 104,
  "timed_out": false,
  "_shards": {
    "total": 7,
    "successful": 7,
    "failed": 0
  },
  "hits": {
    "total": 117280,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "ids": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "125882ac-0adf-4517-982c-0744c2edef61",
          "doc_count": 58642
        },
        {
          "key": "867f966c-0b88-48df-8f1e-f3d48a299ad5",
          "doc_count": 58638
        }
      ]
    }
  }
}
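The difference between the two responses is the number of buckets: DEV reports one UUID for two nodes, while PROD reports two. A minimal sketch of that comparison, using trimmed copies of the responses above (the helper names are illustrative):

```python
def monitored_uuids(search_response):
    """Return {uuid: doc_count} from the "ids" terms aggregation."""
    buckets = search_response["aggregations"]["ids"]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


def missing_node_count(search_response, expected_nodes):
    """How many expected nodes have no UUID of their own in the monitoring data."""
    return max(expected_nodes - len(monitored_uuids(search_response)), 0)


# Trimmed copies of the DEV and PROD responses posted above.
dev_response = {"aggregations": {"ids": {"buckets": [
    {"key": "192b14df-fdfb-40d9-be45-4994e460c011", "doc_count": 89765},
]}}}
prod_response = {"aggregations": {"ids": {"buckets": [
    {"key": "125882ac-0adf-4517-982c-0744c2edef61", "doc_count": 58642},
    {"key": "867f966c-0b88-48df-8f1e-f3d48a299ad5", "doc_count": 58638},
]}}}

print(missing_node_count(dev_response, expected_nodes=2))   # → 1
print(missing_node_count(prod_response, expected_nodes=2))  # → 0
```

A non-zero result for a cluster where every node is running suggests that several nodes are reporting under the same UUID.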

Found the solution here:

The issue seems to be that both nodes had the same UUID, so I deleted the UUID on one of the nodes and restarted Logstash there.
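For anyone repeating this, here is a minimal sketch of the step. It assumes the node's UUID is persisted in a file named `uuid` directly under `path.data` (which would be /opt/elk/data/elk/logstash-5.4.3/data/uuid with the config above) and that Logstash generates a new one on the next start. The sketch moves the file aside instead of deleting it, which is a variation on what was done here, and the function name is illustrative. Stop Logstash on that node first.

```python
import shutil
from pathlib import Path


def set_aside_node_uuid(path_data):
    """Move <path_data>/uuid to <path_data>/uuid.bak and return the backup path.

    Returns None when there is no uuid file to move.
    """
    # Assumption: the persisted node id lives in a file called "uuid".
    uuid_file = Path(path_data) / "uuid"
    if not uuid_file.is_file():
        return None
    backup = uuid_file.with_name("uuid.bak")
    shutil.move(str(uuid_file), str(backup))
    return backup
```

After restarting Logstash on that node, the aggregation query above should show a second bucket once new monitoring documents arrive.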

@chrisronline: Thanks for your support!
