Questions about Filebeat 7.x to 8.x upgrade regarding data streams

Hi,

I have a couple questions regarding the 7.x to 8.x upgrade for Beats/Filebeat.

In Beats/Filebeat 8.x, data streams are used instead of indexes when ingesting data to Elasticsearch.

Question: does this apply only to the Elasticsearch output? (Configure the Elasticsearch output | Beats)

My Filebeat outputs to Logstash (which, in turn, outputs to Elasticsearch). It's unclear to me whether the 'data stream change' affects that case.

Upgrade | Beats Platform Reference [8.19] | Elastic says:

Starting in version 8.0, the default Elasticsearch index templates configure data streams instead of traditional Elasticsearch indices. [...] To use data streams, load the default index templates

... followed by the command beatname setup --index-management.

Questions:

  • Should beatname be replaced by filebeat (in my case), or is beatname an actual binary?
  • Does this command need to be run on a single Filebeat instance? (I would assume so, if all it does is change an index template on the cluster.)

Hi @NominaSumpta

This is a pretty good guide... it should also work for later versions of 8.x.

To answer your specific questions above

filebeat — beatname is a placeholder for the Beat you run.

The setup commands only need to be run from a single instance, once each time you have a new Beat configuration, version, module, etc.

Pro tip: you should just run

filebeat setup -e

which sets up all the assets, instead of setting up separate assets like index-management. Note that for setup, Filebeat will need to be pointed at Kibana as well.
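Pointing Filebeat at Kibana for setup is just a couple of lines in filebeat.yml — a minimal sketch, assuming Kibana on the default local port (host and port are assumptions, adjust to your environment):

```yaml
# filebeat.yml — only used by the setup commands, not at ingest time
setup.kibana:
  host: "localhost:5601"   # assumption: local Kibana on the default port
```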

Hi Stephen,

That actually answers none of my concrete questions…

Hmmm, I specifically answered the following...

To answer your specific questions above

filebeat — beatname is a placeholder for the Beat you run.

The setup commands only need to be run from a single instance, once each time you have a new Beat configuration, version, module, etc.

Pro tip: you should just run

filebeat setup -e

which sets up all the assets, instead of setting up separate assets like index-management. Note that for setup, Filebeat will need to be pointed at Kibana as well.

With respect to data streams....

When you use Beats -> Logstash -> Elasticsearch in 8.x, whether data streams are used is determined by the Logstash elasticsearch output configuration, not the Beats configuration.

Do you WANT to use data streams (which I would recommend) or not?
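As an aside, the Logstash elasticsearch output plugin also has an explicit data_stream option that routes events to data streams directly. A minimal sketch — hosts and credentials here are placeholders, not from this thread:

```conf
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]   # assumption: local cluster
    data_stream => true                  # write events to a data stream
    user => "elastic"                    # placeholder credentials
    password => "secret"
  }
}
```

With data_stream => true you would not set index or action yourself; the plugin derives the target data stream from the event's data_stream.* fields.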

This Winlogbeat example shows a good configuration.

It should create a data stream with Beats output -> Logstash input -> Logstash elasticsearch output -> Elasticsearch:

input {
  beats {
    port => 5044
  }
}

output {
  if [@metadata][pipeline] {
    elasticsearch {
      hosts => "https://localhost:9200"
      manage_template => false
      index => "%{[@metadata][beat]}-%{[@metadata][version]}"
      action => "create"  # <<< IMPORTANT!!
      pipeline => "%{[@metadata][pipeline]}"
      user => "elastic"
      password => "secret"
    }
  } else {
    elasticsearch {
      hosts => "https://localhost:9200"
      manage_template => false
      index => "%{[@metadata][beat]}-%{[@metadata][version]}"
      action => "create"  # <<< IMPORTANT!!
      user => "elastic"
      password => "secret"
    }
  }
}

I would recommend setting this up and testing it.

BTW, I tested this on 8.19.2.

I ran filebeat setup -e first, pointing at Kibana and Elasticsearch, then changed to the Logstash output.

I ran this config:


filebeat.inputs:

- type: filestream

  # Unique ID among all inputs, an ID is required.
  id: my-filestream-id

  # Change to true to enable this input configuration.
  enabled: true

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /var/log/*.log

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: false

  # Period on which files under path should be checked for changes
  #reload.period: 10s

# ======================= Elasticsearch template setting =======================

setup.template.settings:
  index.number_of_shards: 1
  #index.codec: best_compression
  #_source.enabled: false


# setup.kibana:

# ---------------------------- Elasticsearch Output ----------------------------
# output.elasticsearch:
#   # Array of hosts to connect to.
#   hosts: ["localhost:9200"]

# ------------------------------ Logstash Output -------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["localhost:5044"]

# ================================= Processors =================================
processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~
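For completeness, during the filebeat setup -e step the output and Kibana sections above would have looked roughly like this (hosts are assumptions), before the Logstash output was re-enabled:

```yaml
# temporary configuration used only while running `filebeat setup -e`
setup.kibana:
  host: "localhost:5601"        # assumption: local Kibana

output.elasticsearch:
  hosts: ["localhost:9200"]     # assumption: local Elasticsearch

# output.logstash:              # re-enabled after setup completes
#   hosts: ["localhost:5044"]
```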

I ran the following Logstash.conf:

# Sample Logstash configuration for creating a simple
# Beats -> Logstash -> Elasticsearch pipeline.

input {
  beats {
    port => 5044
  }
}

output {
  if [@metadata][pipeline] {
    elasticsearch {
      hosts => "http://localhost:9200"
      manage_template => false
      index => "%{[@metadata][beat]}-%{[@metadata][version]}"
      action => "create" 
      pipeline => "%{[@metadata][pipeline]}"
      user => "elastic"
      password => "secret"
    }
  } else {
    elasticsearch {
      hosts => "http://localhost:9200"
      manage_template => false
      index => "%{[@metadata][beat]}-%{[@metadata][version]}"
      action => "create" 
      user => "elastic"
      password => "secret"
    }
  }
}

This created a data stream.