Filebeat to elastic/cloud not using defined index in yml config

tymercer · February 22, 2023, 7:19pm

First, I am new to all of this and have next to no knowledge of how all of our systems were set up originally, as I took this over when the person in charge of it left the company.

ELK 6.8.12, Filebeat 6.4 (yes, old; it will be upgraded after the move)
I am in the process of moving from an on-prem full ELK stack to Elastic hosted services. We are closing all of our on-prem sites down.
However, Elastic Cloud apparently drops Kafka and Logstash from the stack, and I have to change Filebeat to point directly to Elasticsearch.

It does not seem to be working properly.
With our current full stack setup the logs are like this, which I think is normal.
server -> kafka -> logstash -> elasticsearch
Each server has filebeats running, kafka has 3 servers only running kafka, logstash has 3 servers only running logstash, elasticsearch has 6 servers.. 3 running elasticsearch and kibana (elastic clients) and the remaining 3 are elasticsearch only (data nodes)
Elasticsearch generates indices named "logstash-TOPIC-YEAR.NUM", and all of the servers get their logs pushed into the single index for their topic (gk, iis, etc)

The servers have filebeats installed with a pretty simple config, this is an example of one of them.

####------   filebeat.yml
filebeat.config.modules:
  enabled: true
  path: ${path.config}/modules.d/*.yml

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /opt/logs/gk-app.log
  multiline.pattern: ^\[
  multiline.negate: true
  multiline.match: after
  fields_under_root: true
  fields:
   type: WLP
   input_type: "log"
logging.level: error
output.kafka:
  hosts: ["kafka01:9092", "kafka02:9092", "kafka03:9092"]
  topic: gk-logstash
  partition.round_robin:
    reachable_only: false
  required_acks: 1
  compression: gzip
  max_message_bytes: 1000000
  version: 0.10.2.0

Kafka doesn't appear to have anything really going on with it...

####------   server.properties file
broker.id=1
advertised.listeners=PLAINTEXT://172.1.1.3:9092
listeners=PLAINTEXT://172.1.1.3:9092
num.network.threads=50
num.io.threads=30
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/opt/kafka_data1/logs
num.recovery.threads.per.data.dir=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.bytes=10073741824
log.retention.check.interval.ms=30000
zookeeper.connect=localhost:2181
delete.topic.enable=true
num.partitions=12
default.replication.factor=2
zookeeper.connection.timeout.ms=600000
offsets.retention.minutes=43200

####------   zookeeper.properties file
initLimit=5
syncLimit=2
maxClientCnxns=0
clientPort=2181
maxClientCnxns=0
server.1=172.1.1.3:2888:3888
server.2=172.1.1.4:2888:3888
server.3=172.1.1.5:2888:3888
dataDir=/opt/kafka_data1/gk_data
autopurge.snapRetainCount=5
autopurge.purgeInterval=12

I don't see any other configuration files or settings anywhere on these 3 kafka servers.

This is what is being used to start logstash

Logstash --path.settings /etc/logstash/ -r -f /opt/logstash-parsers/parsers/dl-logstash/ -l /opt/share/logs/dl-logstash/ -w 1
Logstash --path.settings /etc/logstash/ -r -f /opt/logstash-parsers/parsers/ti-logstash/ -l /opt/share/logs/ti-logstash/ -w 1
etc,etc for each server type we are running filebeats on.

Logstash servers have this in the /etc/logstash/ directory
00-kafka-input.conf
10-IHS-filter.conf
10-WAS-filter.conf
20-elasticsearch-output.conf
logstash.yml
and other config files for startup, java and log4j

This is one of the files, the others are similar with their own configs but nothing sticking out that points to anything other than the zookeeper1/2/3

####------   00-kafka-input.conf
input {
  kafka {
    bootstrap_servers => "${ZOOKEEPER1},${ZOOKEEPER2},${ZOOKEEPER3}"
    topics => []
    group_id =>
    codec => 'json'
    session_timeout_ms => '30000'
    max_poll_records => '250'
    consumer_threads => 4
    decorate_events => true
  }
}
filter {
  mutate {
    copy => { '[@metadata][kafka]' => '[metadata][kafka]' }
  }

####------ logstash.yml
node.name: ${CONFIG}_${HOST}
pipeline.id: ${CONFIG}-${HOST}
path.data: /opt/logstash-parsers/${CONFIG}/
xpack.monitoring.elasticsearch.url: ["https://172.1.1.4:9200","https://172.1.1.3:9200","https://172.1.1.5:9200"]
xpack.monitoring.elasticsearch.username: logstash_user
xpack.monitoring.elasticsearch.password: logstash_password
xpack.monitoring.elasticsearch.ssl.ca: /etc/logstash/ca.pem

The Elasticsearch servers

#### /etc/elasticsearch
#	elasticsearch.keystore  
#	elasticsearch.yml  
#	jvm.options  
#	log4j2.properties  
#	role_mapping.yml  
#	roles.yml  
#	users  
#	users_roles
####------   
####------    elasticsearch.yml
####------   
cluster.name: elk_cluster_01
node.name: elstc01
node.master: false
node.data: false
node.ingest: false
network.bind_host: 0
network.host: [_ens160_]
discovery.zen.ping.unicast.hosts: ["elstd01", "elstd02", "elstd03"]
bootstrap.memory_lock: true
bootstrap.system_call_filter: false
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.ssl.key:                     /etc/elasticsearch/certs/key_elstc01.pem
xpack.ssl.certificate:             /etc/elasticsearch/certs/cert_elstc01.pem
xpack.ssl.certificate_authorities: [ "/etc/elasticsearch/certs/ca.pem" ]
xpack.monitoring.collection.enabled: true
xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.key:  /etc/elasticsearch/certs/key_elstc01.pem
xpack.security.http.ssl.certificate: /etc/elasticsearch/certs/cert_elstc01.pem
xpack.security.http.ssl.certificate_authorities: [ "/etc/elasticsearch/certs/ca.pem" ]
xpack:
  security:
    authc:
        realms:
          native1:
                type: native
                order: 0
          ad1:
                type: active_directory
                order: 1
                domain_name: OURDOMAIN
                url: ldaps://dc01:636, ldaps://dc02:636
                ssl:
                  certificate_authorities: [ "/etc/elasticsearch/certs/ldap_ca.crt", "/etc/elasticsearch/certs/ca.pem" ]
                unmapped_groups_as_roles: true
                load_balance.type: round_robin
                follow_referrals: false
reindex.ssl.certificate_authorities: ["/etc/elasticsearch/certs/oldcluster.crt"]
reindex.ssl.verification_mode: certificate

Now for the problem...
We are migrating to Elastic on cloud, hosted by Elastic.
There is no kafka or logstash and we are needing to point the filebeats directly to elastic.
This is being done with the following changes made to the filebeat.yml file

filebeat.config.modules:
  enabled: true
  path: ${path.config}/modules.d/*.yml
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /opt/gk_logs/gatekeeper-app.log
  multiline.pattern: ^\[
  multiline.negate: true
  multiline.match: after
  fields_under_root: true
  fields:
   type: WLP
   input_type: "log"
logging.level: error

This does work, it will successfully hit the cloud instance and all of the -test passes, it even starts an index named FILEBEAT-6.4.2-YEAR-MON-DAY
But, if I change it to use output.elasticsearch... it does not work.
No errors are generated that I can see, but no new indices are created and I can't get it to use the correct name from our on-prem configurations.
I've tried using the below, in various formats, to no avail.
The end goal is to have each server/topic generating the index as it was before moving to cloud

filebeat.yml
filebeat.config.modules:
  enabled: true
  path: ${path.config}/modules.d/*.yml
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /opt/gk_logs/gatekeeper-app.log
  multiline.pattern: ^\[
  multiline.negate: true
  multiline.match: after
  fields_under_root: true
  fields:
   type: WLP
   input_type: "log"
logging.level: error

cloud.id: "CLOUDID"
cloud.auth: "elastic_user:elastic_pass"

#output.elasticsearch.index: "gs-logstash-%{[agent.version]}"
setup.template.enabled: true
setup.template.name: "bgcs-logstash-%{[agent.version]}"
setup.template.pattern: "bgcs-logstash-%{[agent.version]}"

output.elasticsearch:
index: "bgcs-logstash-%{[agent.version]}"
topic: gk-logstash
partition.round_robin:
 reachable_only: false
 required_acks: 1
 compression: gzip

Any help would be greatly appreciated, I'm on a very short timeline as this was supposed to be shutdown last week.

leandrojmp · February 22, 2023, 8:11pm

What should the index name look like? It is not clear, since you didn't share your Logstash output configuration.

Also, is your Filebeat running? Your configuration has indentation and configuration errors, so I'm not sure that it is.

This is wrong: there are no topic or partition settings for output.elasticsearch, and the index setting has the wrong indentation.

It should be something like this:

output.elasticsearch:
  index: "bgcs-logstash-%{[agent.version]}"

This would create an index named bgcs-logstash- followed by the version of the Filebeat.

Kafka is not an Elastic tool; it is a third-party tool. Logstash was never part of Elastic Cloud, but you can still run it on your own infrastructure if you need to.

You also didn't share what you have in your Logstash filters. Depending on your filters, there are some things that you cannot do with only Filebeat and Elasticsearch ingest pipelines.

leandrojmp · February 22, 2023, 8:23pm

Also, since you have security enabled with ldap integration this means that you have an on-premises license, have you already opened a ticket with Elastic support about this migration?

tymercer · February 22, 2023, 8:41pm

The filebeat.yml is doing output to kafka

output.kafka:
  hosts: ["kafka01:9092", "kafka02:9092", "kafka03:9092"]
  topic: gk-logstash

not sure where the logstash.yml would be that would indicate the current output as the filebeat.yml only has the kafka.output section.
There are some parser files on the kafka server, would it be in those?

The current indices are named

logstash-PIPELINE-date
So, something like logstash-gk-app-2023-2-22.1

That is what it needs to stay as, with the filebeat version

filebeat is running, it is still pushing data to the filebeat.xxxxx index file, just not the one listed in the index line.
The spacing is most likely just copy/paste errors, I used a YAML validator on the actual configs in use.

What output file are you looking for? there are some parser files that have configs in them, here is one of them

output{
  elasticsearch{
    document_type => "%{type}"
    hosts => ["${ELASTICSEARCH1}","${ELASTICSEARCH2}","${ELASTICSEARCH3}"]
    ssl => true
    cacert => "ca.der"
    index => "logstash-gk-%{+YYYY.MM.dd}"
    user => "${ELASTICUSER}"
    password => "${ELASTICPASSWORD}"
    manage_template => false
  }
}

I have talked to support and have had them help me with several processes, but there is a limit to what they can do because of the version and the need for professional services, which our management simply won't do.

Thanks

leandrojmp · February 22, 2023, 9:13pm

So, from the Logstash output you shared, this is how you are naming your indices.

index => "logstash-gk-%{+YYYY.MM.dd}"

Then, in filebeat you would have something like this:

output.elasticsearch:
  index: "logstash-gk-%{+yyyy.MM.dd}"

This should work and create the index with this name. Can you share the Filebeat logs from when it was started?

You shared a lot of configurations, and it is really confusing to try to understand what you have working now.

Do you still have Filebeat sending data to Kafka, or have you changed everything to send directly to Elastic Cloud? Is Logstash still running?

What do you have in those files? Right now Filebeat collects logs on your servers and sends the data to Kafka, Logstash consumes the data from those Kafka topics, and Logstash then processes the logs and applies some filters. All of those filters will be lost when you start sending the data directly to Elasticsearch, and depending on what you do with this data, that can break some things on your side.

Before migrating you need to check whether the filters that Logstash applies to your messages can be replicated in Elasticsearch using an ingest pipeline. Not everything that Logstash can do is possible with an ingest pipeline.

The files with the filters are probably something like these:

10-IHS-filter.conf
10-WAS-filter.conf

So you will also need to check this.
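
If the filters can be rebuilt as an ingest pipeline, Filebeat can be told to send every event through it. The snippet below is only a sketch of that wiring: the pipeline name gk-app-parse is hypothetical, and the pipeline itself would have to be created in the Elastic Cloud cluster first.

```yaml
# filebeat.yml (sketch, not a tested config)
output.elasticsearch:
  index: "logstash-gk-%{+yyyy.MM.dd}"
  # Hypothetical ingest pipeline name. The pipeline must already exist
  # in the cluster; Filebeat does not create it from this setting.
  pipeline: "gk-app-parse"
```

Anything the Logstash filters did that has no ingest-processor equivalent would still need another home, so this only covers the part of the parsing that can be ported.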

I know that you said that you are on short timeline, but this migration is basically rebuilding the way you collect and ingest data since you are removing two important tools from your data ingestion flow.

tymercer · February 22, 2023, 9:47pm

The config file passes yaml validation, so I'm guessing it's just a formatting error when posting it here.

The filebeat doesn't need the following in it?

setup.template.enabled: true
setup.template.name: gk-logstash-%{[agent.version]}
setup.template.pattern: gk-logstash-%{[agent.version]}

There are servers still sending data to kafka, but I have pulled 1 server from each 'topic' to change the configuration on to point to elastic cloud.

The only changes in the filebeat.yml file was removing the kafka.output and putting in the elasticsearch.output portion.

I changed the yml to the index pattern you have and this is the error I get starting filebeat now

pipeline/output.go:100  Failed to connect to backoff(elasticsearch(https://XXXXXXX.northeurope.azure.elastic-cloud.com:443)): Connection marked as failed because the onConnect callback failed: Error loading Elasticsearch template: error creating template instance: key not found

at least it's doing something different again.

leandrojmp · February 22, 2023, 9:55pm

These lines are used to create a template for your index. If you are using the index name logstash-gk-%{+yyyy.MM.dd}, then the pattern should be logstash-gk-* to match your index. The template also uses the fields described in the fields.yml from the Filebeat directory.

Since you are using a very old Filebeat version with Elastic Cloud, which runs the latest Elastic version, I think this may cause some issues. I would try commenting out those lines to see if it can at least write to the correct index.
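
As a rough sketch of the template settings described above, assuming the daily index name logstash-gk-%{+yyyy.MM.dd}: the template name is a plain identifier without a date, and the pattern is a wildcard that covers every daily index.

```yaml
# filebeat.yml (sketch, not a tested config)
# setup.template.name is the name the template is stored under;
# setup.template.pattern is the index pattern the template applies to.
setup.template.name: "logstash-gk"
setup.template.pattern: "logstash-gk-*"

output.elasticsearch:
  index: "logstash-gk-%{+yyyy.MM.dd}"

# Alternative if template loading from the 6.x Filebeat keeps failing:
# skip it and rely on whatever mappings the cluster provides.
# setup.template.enabled: false
```

With template loading disabled, the new indices get whatever templates or dynamic mappings already exist in the cluster, so field types may differ from the on-prem indices.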

tymercer · February 22, 2023, 9:56pm

ok, looks like i got it working with the following

setup.template.name: logstash-gk-%{+yyyy.MM.dd}
setup.template.pattern: logstash-gk-%{+yyyy.MM.dd}

output.elasticsearch:
 index: logstash-gk-%{+yyyy.MM.dd}

Just fails when I add the beat.version tag..

Guessing this isn't the way to do it

"gk-logstash-%{[agent.version]}-%{[+yyyy.MM.dd]}"

If it continues pushing them to the logs that are currently the same name, so be it.. I can worry about updating them later.
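
For the version-tagged name attempted above, two things are likely wrong. This is a sketch under assumptions rather than a verified fix: on Filebeat 6.x the version field should be beat.version (agent.version is the 7.0 and later name), and the date part is written %{+yyyy.MM.dd} with no square brackets. The logstash-gk prefix is assumed here to match the existing on-prem indices.

```yaml
# filebeat.yml (sketch for Filebeat 6.x, not a tested config)
output.elasticsearch:
  # Assumption: beat.version is the 6.x field; agent.version is 7.0+.
  # Date formatting uses %{+...} without square brackets.
  index: "logstash-gk-%{[beat.version]}-%{+yyyy.MM.dd}"

# The template name and pattern have to change with the index name.
setup.template.name: "logstash-gk-%{[beat.version]}"
setup.template.pattern: "logstash-gk-%{[beat.version]}-*"
```

A missing agent.version key on 6.4 would also be consistent with the earlier "error creating template instance: key not found" message, although that is an inference from the error text.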

Last problem is trying to figure out how to get the pipelines, users and roles out of this thing (can't be viewed/edited in kibana, greyed out and says they weren't created with centralized management)

Thanks for the help so far

leandrojmp · February 22, 2023, 10:10pm

You can't run Logstash pipelines on Elasticsearch; you will need to rewrite them as ingest pipelines in your Elastic Cloud cluster. This is what I mentioned in a previous answer: you will need to see whether you can replicate all of your Logstash filters as Elasticsearch ingest pipelines.

For the users and roles, you can check in Kibana how they are mapped. Also, I'm not sure how you will integrate your local Active Directory with Elastic Cloud. Exposing it on the internet is a security risk, so you should look at other methods, such as SSO, if your company has it.

tymercer · February 22, 2023, 10:35pm

Already working on SSO/SAML...
I'm aware that we can't use logstash pipelines, but I wanted to export them to save them if they are actually needed and work on converting them over (if possible). I just have no way of finding out how to get them out of the system right now, unless they are buried in another configuration file somewhere.... and whoever set our clusters up need to be smacked around with a large trout for a while.. even the elastic support guy was 'wow... and this is working'

Mappings, tried looking for that, can't find anything.
There are no roles, user_mappings or such configuration files I have been able to find. Nothing in our AD says anything about Elastic, ELK or anything related to it we can find and doing an API query doesn't give us anything because that apparently only pulls stuff from the local account store.

I have the file that has the LDAP/AD realm defined but it's pretty empty. I'm just wondering if the old admin just gave everyone reader or something... who knows

system · March 23, 2023, 12:35am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
