Duplicate data in Kibana

I have an ELK setup to aggregate Kubernetes logs, but in Kibana every log line appears twice. Here is a sample:

10:43:02.416
[2021-09-03 14:43:02,414] [INFO] - Encoding password to hash!
10:43:02.416
[2021-09-03 14:43:02,415] [INFO] - Checking the password matching with hash
10:43:02.416
[2021-09-03 14:43:02,415] [WARNING] - Email verification is pending!
10:43:02.416
[2021-09-03 14:43:02,414] [INFO] - Encoding password to hash!
10:43:02.416
[2021-09-03 14:43:02,415] [INFO] - Checking the password matching with hash
10:43:02.416
[2021-09-03 14:43:02,415] [WARNING] - Email verification is pending!
10:54:17.290
[2021-09-03 14:54:17,290] [INFO] - Encoding password to hash!
10:54:17.290
[2021-09-03 14:54:17,290] [INFO] - Checking the password matching with hash
10:54:17.290
[2021-09-03 14:54:17,290] [WARNING] - Email verification is pending!
10:54:17.290
[2021-09-03 14:54:17,290] [INFO] - Encoding password to hash!
10:54:17.290
[2021-09-03 14:54:17,290] [INFO] - Checking the password matching with hash
10:54:17.290
[2021-09-03 14:54:17,290] [WARNING] - Email verification is pending!
10:54:40.719
[2021-09-03 14:54:40,719] [INFO] - Encoding password to hash!
10:54:40.719
[2021-09-03 14:54:40,719] [INFO] - Checking the password matching with hash
10:54:40.719
[2021-09-03 14:54:40,719] [WARNING] - Email verification is pending!
10:54:40.719
[2021-09-03 14:54:40,719] [INFO] - Encoding password to hash!
10:54:40.719
[2021-09-03 14:54:40,719] [INFO] - Checking the password matching with hash
10:54:40.719
[2021-09-03 14:54:40,719] [WARNING] - Email verification is pending!
10:54:41.834
[2021-09-03 14:54:41,833] [INFO] - Set user request expire time!
10:54:41.834
[2021-09-03 14:54:41,833] [INFO] - Set user request expire time!
10:54:41.837
[2021-09-03 14:54:41,837] [INFO] - Returning verification content to user!
10:54:41.837
[2021-09-03 14:54:41,837] [INFO] - Returning verification content to user!

When I check this specific application container's logs directly, I don't see any duplication.

filebeat.yml
# ==============================================================================
# https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-reference-yml.html
# ========================== Filebeat global options ===========================

# Registry data path. If a relative path is used, it is considered relative to the
# data path.
#filebeat.registry.path: ${path.data}/registry

# The permissions mask to apply on registry data, and meta files. The default
# value is 0600.  Must be a valid Unix-style file permissions mask expressed in
# octal notation.  This option is not supported on Windows.
#filebeat.registry.file_permissions: 0600

# The timeout value that controls when registry entries are written to disk
# (flushed). When an unwritten update exceeds this value, it triggers a write
# to disk. When flush is set to 0s, the registry is written to disk after each
# batch of events has been published successfully. The default value is 0s.
#filebeat.registry.flush: 0s


# Starting with Filebeat 7.0, the registry uses a new directory format to store
# Filebeat state. After you upgrade, Filebeat will automatically migrate a 6.x
# registry file to use the new directory format. If you changed
# filebeat.registry.path while upgrading, set filebeat.registry.migrate_file to
# point to the old registry file.
#filebeat.registry.migrate_file: ${path.data}/registry

# By default Ingest pipelines are not updated if a pipeline with the same ID
# already exists. If this option is enabled Filebeat overwrites pipelines
# every time a new Elasticsearch connection is established.
#filebeat.overwrite_pipelines: false

# How long filebeat waits on shutdown for the publisher to finish.
# Default is 0, not waiting.
#filebeat.shutdown_timeout: 0

# Enable filebeat config reloading
filebeat.config:
  inputs:
    enabled: true
    path: inputs.d/*.yml
    reload.enabled: true
    reload.period: 10s
  modules:
    enabled: true
    path: modules.d/*.yml
    reload.enabled: true
    reload.period: 10s

# =========================== Filebeat autodiscover ============================

# Autodiscover allows you to detect changes in the system and spawn new modules
# or inputs as they happen.

filebeat.autodiscover:
  providers:
    - type: kubernetes
      hints.enabled: true

processors:
  - add_cloud_metadata:
      cloud.id: ${ELASTIC_CLOUD_ID}
      cloud.auth: ${ELASTIC_CLOUD_AUTH}

# ---------------------------- Elasticsearch Output ----------------------------
output.elasticsearch:
  # Boolean flag to enable or disable the output module.
  enabled: true

  # Array of hosts to connect to.
  # Scheme and port can be left out and will be set to the default (http and 9200)
  # In case you specify an additional path, the scheme is required: http://localhost:9200/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:9200
  hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']

  # Set gzip compression level.
  #compression_level: 0

  # Configure escaping HTML symbols in strings.
  #escape_html: false

  # Protocol - either `http` (default) or `https`.
  protocol: "http"

  # Authentication credentials - either API key or username/password.
  #api_key: "id:api_key"
  username: ${ELASTICSEARCH_USERNAME}
  password: ${ELASTICSEARCH_PASSWORD}

  # Dictionary of HTTP parameters to pass within the URL with index operations.
  #parameters:
    #param1: value1
    #param2: value2

  # Number of workers per Elasticsearch host.
  #worker: 1

  # Optional index name. The default is "filebeat" plus date
  # and generates [filebeat-]YYYY.MM.DD keys.
  # In case you modify this pattern you must update setup.template.name and setup.template.pattern accordingly.
  #index: "filebeat-%{[agent.version]}-%{+yyyy.MM.dd}"

  # Optional ingest node pipeline. By default no pipeline will be used.
  #pipeline: ""

  # Optional HTTP path
  #path: "/elasticsearch"

  # The number of times a particular Elasticsearch index operation is attempted. If
  # the indexing operation doesn't succeed after this many retries, the events are
  # dropped. The default is 3.
  max_retries: 3

  # The maximum number of events to bulk in a single Elasticsearch bulk API index request.
  # The default is 50.
  bulk_max_size: 50

  # The number of seconds to wait before trying to reconnect to Elasticsearch
  # after a network error. After waiting backoff.init seconds, the Beat
  # tries to reconnect. If the attempt fails, the backoff timer is increased
  # exponentially up to backoff.max. After a successful connection, the backoff
  # timer is reset. The default is 1s.
  backoff.init: 1s

  # The maximum number of seconds to wait before attempting to connect to
  # Elasticsearch after a network error. The default is 60s.
  backoff.max: 60s

  # Configure HTTP request timeout before failing a request to Elasticsearch.
  timeout: 90

# ====================== Index Lifecycle Management (ILM) ======================

# Configure index lifecycle management (ILM). These settings create a write
# alias and add additional settings to the index template. When ILM is enabled,
# output.elasticsearch.index is ignored, and the write alias is used to set the
# index name.

# Enable ILM support. Valid values are true, false, and auto. When set to auto
# (the default), the Beat uses index lifecycle management when it connects to a
# cluster that supports ILM; otherwise, it creates daily indices.
setup.ilm.enabled: true

# Set the prefix used in the index lifecycle write alias name. The default alias
# name is 'filebeat-%{[agent.version]}'.
setup.ilm.rollover_alias: 'filebeat-%{[agent.version]}'

# Set the rollover index pattern. The default is "%{now/d}-000001".
setup.ilm.pattern: "{now/d}-000001"

# Set the lifecycle policy name. The default policy name is
# 'beatname'.
setup.ilm.policy_name: "filebeat-rollover-7-days"

# The path to a JSON file that contains a lifecycle policy configuration. Used
# to load your own lifecycle policy.
setup.ilm.policy_file: /usr/share/filebeat/policy/ilm-policy.json

# Disable the check for an existing lifecycle policy. The default is true. If
# you disable this check, set setup.ilm.overwrite: true so the lifecycle policy
# can be installed.
setup.ilm.check_exists: true

# Overwrite the lifecycle policy at startup. The default is false.
setup.ilm.overwrite: false

# ============================= X-Pack Monitoring ==============================
# Filebeat can export internal metrics to a central Elasticsearch monitoring
# cluster.  This requires xpack monitoring to be enabled in Elasticsearch.  The
# reporting is disabled by default.

# Set to true to enable the monitoring reporter.
monitoring.enabled: true

# Sets the UUID of the Elasticsearch cluster under which monitoring data for this
# Filebeat instance will appear in the Stack Monitoring UI. If output.elasticsearch
# is enabled, the UUID is derived from the Elasticsearch cluster referenced by output.elasticsearch.
#monitoring.cluster_uuid:

# ================================== Logging ===================================

# There are four options for the log output: file, stderr, syslog, eventlog
# The file output is the default.

# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
logging.level: info

# Enable debug output for selected components. To enable all selectors use ["*"]
# Other available selectors are "beat", "publisher", "service"
# Multiple selectors can be chained.
#logging.selectors: [ ]

# Send all logging output to stderr. The default is false.
#logging.to_stderr: false

# Send all logging output to syslog. The default is false.
#logging.to_syslog: false

# Send all logging output to Windows Event Logs. The default is false.
#logging.to_eventlog: false

# If enabled, Filebeat periodically logs its internal metrics that have changed
# in the last period. For each metric that changed, the delta from the value at
# the beginning of the period is logged. Also, the total values for
# all non-zero internal metrics are logged on shutdown. The default is true.
logging.metrics.enabled: true

# The period after which to log the internal metrics. The default is 30s.
logging.metrics.period: 30s

# Logging to rotating files. Set logging.to_files to false to disable logging to
# files.
logging.to_files: false
# logging.files:
  # Configure the path where the logs are written. The default is the logs directory
  # under the home path (the binary location).
  #path: /var/log/filebeat

  # The name of the files where the logs are written to.
  #name: filebeat

  # Configure log file size limit. If limit is reached, log file will be
  # automatically rotated
  #rotateeverybytes: 10485760 # = 10MB

  # Number of rotated log files to keep. Oldest files will be deleted first.
  #keepfiles: 7

  # The permissions mask to apply when rotating log files. The default value is 0600.
  # Must be a valid Unix-style file permissions mask expressed in octal notation.
  #permissions: 0600

  # Enable log file rotation on time intervals in addition to size-based rotation.
  # Intervals must be at least 1s. Values of 1m, 1h, 24h, 7*24h, 30*24h, and 365*24h
  # are boundary-aligned with minutes, hours, days, weeks, months, and years as
  # reported by the local system clock. All other intervals are calculated from the
  # Unix epoch. Defaults to disabled.
  #interval: 0

  # Rotate existing logs on startup rather than appending to the existing
  # file. Defaults to true.
  # rotateonstartup: true

  # Rotated files are suffixed with a number (e.g. filebeat.1) when renamed
  # during rotation; when set to date, the date is appended to the file name
  # instead. On rotation a new file is created and older files are untouched.
  #suffix: count

# Set to true to log messages in JSON format.
#logging.json: false

and

kubernetes.yml
# https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-container.html
- type: container
  paths:
    - '/var/lib/docker/containers/*/*.log'
  close_inactive: 60h
  clean_removed: true
  close_timeout: 90m
  clean_inactive: 120m
  ignore_older: 100m
  harvester_limit: 40
  processors:
    - add_kubernetes_metadata:
        in_cluster: true
    - add_id: ~

How can I fix this?

Versions
Kibana : 7.13.1
ES : 7.13.4
Filebeat : 7.14.0

Hi @rp346,

Is it possible that Filebeat is reading the logs twice because the same inputs are configured both in inputs.d and modules.d?
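Another common cause with this exact combination of settings: the static `container` input in `inputs.d/kubernetes.yml` and the `kubernetes` autodiscover provider can both harvest the same files under `/var/lib/docker/containers/`, so every line is shipped twice. If that is the overlap, one fix is to keep only the autodiscover provider and delete the static input. A minimal sketch (the `hints.default_config` block and paths are assumptions; adjust them to your cluster):

```yaml
# filebeat.yml -- rely on autodiscover alone; remove the overlapping static
# container input from inputs.d/kubernetes.yml so each log file has exactly
# one harvester.
filebeat.autodiscover:
  providers:
    - type: kubernetes
      hints.enabled: true
      hints.default_config:
        type: container
        paths:
          - /var/log/containers/*${data.kubernetes.container.id}.log
```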
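Alternatively (or additionally), duplicates can be made harmless on the Elasticsearch side by deriving the document `_id` from the event itself, so a second copy of the same line overwrites the first instead of creating a new document. A hedged sketch using Filebeat's `fingerprint` processor — the field list here is an assumption, so pick fields that uniquely identify a line in your events:

```yaml
processors:
  - fingerprint:
      fields: ["log.file.path", "log.offset", "message"]
      target_field: "@metadata._id"
```

Note that the `add_id` processor already in your `kubernetes.yml` generates a fresh random ID per event, so it does not deduplicate; `fingerprint` over fixed fields does, because both copies of a line hash to the same `_id`.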
