Does filebeat cache?


(Sijis Aviles) #1

We bake our AMI in AWS with packer, and the image includes filebeat.

However, we've noticed that new instances show the IP of the instance used to create the AMI (i.e. the wrong IP) as the shipper in logstash.

Currently, if we just restart filebeat, the shipper details are correct.

Is there a cache file that handles this which i'd have to remove?

Thanks,

Sijis


(ruflin) #2

It seems that when your new instance boots, filebeat is already running rather than being freshly started. Is this intended? The shipper IP is loaded once, when filebeat starts and reads its config, and is cached from then on because it is assumed not to change while filebeat is running.


(Sijis Aviles) #3

The new instance is built from an ami which had filebeat installed. When the instance first boots, it should start filebeat (and it does).
So where is the cache location of this config? I think if i delete that cache file before 'sealing' the AMI, then i should be OK.
My speculation is that somehow filebeat is reading the cache instead of the new hostname.

As an added note, i do not have a 'shipper' section in my filebeat.yml file.


(ruflin) #4

Which OS and Version are you using? I have to make some tests myself here. Filebeat itself does not have a cache location, but I have to check in detail where Golang reads the value from.


(Sijis Aviles) #5

@ruflin All systems are Fedora22.

On server (forward to logstash):

  • filebeat-1.0.0-1.x86_64

On logstash:

  • logstash-2.1.0-1.noarch

(Sijis Aviles) #6

These are the configs i'm using

filebeat config on server:

logging:
  level: info

  # enable file rotation with default configuration
  to_files: true

  # do not log to syslog
  to_syslog: false

  files:
    path: /var/log/filebeat
    name: filebeat.log
    keepfiles: 3


filebeat:

  # General filebeat configuration options
  #spool_size: 1024
  #idle_timeout: 5s
  registry_file: /var/lib/filebeat/registry
  config_dir: /etc/filebeat/conf.d


  prospectors:
    # Each - is a prospector. Below are the prospector specific configurations
    -
      paths:
        - /var/log/filebeat/*.log

      #encoding: plain
      type: log
      #ignore_older: 24h
      #scan_frequency: 10s
      #harvester_buffer_size: 16384
      #tail_on_rotate: false

output:
  logstash:
    enabled: true

    hosts: ["logstash:1001"]

    # index configures '@metadata.beat' field to be used by Logstash for
    # indexing. By Default the beat name is used (e.g. filebeat, topbeat, packetbeat)
    index: mybeat

logstash config:

input {
  beats {
    port => 1001
  }
}
output {
  stdout {
    codec => rubydebug
  }
}

(ruflin) #7

@sijis I did some tests on my side with Fedora, and the hostname change seems to depend on timing and on where and how you change it. Which command do you use to change the hostname? At what stage is the hostname of the new machine set, relative to the start of filebeat?

For golang, the file that matters seems to be: /proc/sys/kernel/hostname

Here is the code reading it: https://github.com/golang/go/blob/master/src/os/sys_linux.go#L10

Since you write that it works as expected after restarting filebeat, I assume it has more to do with the timing of setting the hostname. But be aware that I'm not a Fedora expert :)
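
To make the timing point concrete: a process that reads the hostname once at startup keeps reporting that value even if the hostname is changed afterwards. The following is only a sketch, assuming a Linux host with /proc mounted. The function names are made up for illustration and are not part of filebeat.

```shell
#!/bin/sh
# Illustrative only: mimics a daemon that resolves its name once at startup.

# Read the hostname from the kernel, using the file mentioned above.
read_kernel_hostname() {
    cat /proc/sys/kernel/hostname
}

# Succeeds if the name captured at startup ($1) no longer matches the kernel.
hostname_is_stale() {
    [ "$1" != "$(read_kernel_hostname)" ]
}

# The name to compare: first argument if given (for example the shipper
# value seen in logstash), otherwise whatever the kernel reports right now.
startup_name="${1:-$(read_kernel_hostname)}"

if hostname_is_stale "$startup_name"; then
    echo "stale: started as '$startup_name', now '$(read_kernel_hostname)'"
else
    echo "current: '$startup_name'"
fi
```

Passing the shipper name that logstash shows as the first argument gives a quick check of whether filebeat captured its name before the hostname was set.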


(Steffen Siering) #8

@sijis can you post the output of 'ls /etc/rc<level>.d' ? Maybe filebeat is started before network/hostname is configured.


(Sijis Aviles) #9

@steffens I think there is an order issue going on.

This is the output of ls /etc/rc3.d/

[fedora@ip-xx-xx-xx-xx ~]$ ls /etc/rc3.d
K50netconsole  S10network  S50logstash  S95jexec  S98filebeat

That output is a little misleading, as Fedora 22 uses systemd. This is the equivalent output from systemd:

[fedora@ip-xx-xx-xx-xx ~]$ systemctl list-dependencies
default.target
● ├─auditd.service
● ├─cloud-config.service
● ├─cloud-final.service
● ├─cloud-init-local.service
● ├─cloud-init.service
● ├─crond.service
● ├─dbus.service
● ├─filebeat.service
● ├─logstash.service
● ├─network.service
● ├─plymouth-quit-wait.service
● ├─plymouth-quit.service
● ├─sshd.service
● ├─systemd-ask-password-wall.path
● ├─systemd-logind.service
● ├─systemd-update-utmp-runlevel.service
● ├─systemd-user-sessions.service
● ├─tailon.service
● ├─basic.target
● │ ├─dnf-makecache.timer
● │ ├─fedora-autorelabel-mark.service
● │ ├─fedora-autorelabel.service
● │ ├─fedora-loadmodules.service
● │ ├─paths.target
● │ ├─slices.target
● │ │ ├─-.slice
● │ │ └─system.slice
● │ ├─sockets.target
● │ │ ├─dbus.socket
● │ │ ├─systemd-initctl.socket
● │ │ ├─systemd-journald-audit.socket
● │ │ ├─systemd-journald-dev-log.socket
● │ │ ├─systemd-journald.socket
● │ │ ├─systemd-shutdownd.socket
● │ │ ├─systemd-udevd-control.socket
● │ │ └─systemd-udevd-kernel.socket
● │ ├─sysinit.target
● │ │ ├─dev-hugepages.mount
● │ │ ├─dev-mqueue.mount
● │ │ ├─dracut-shutdown.service
● │ │ ├─kmod-static-nodes.service
● │ │ ├─ldconfig.service
● │ │ ├─plymouth-read-write.service
● │ │ ├─plymouth-start.service
● │ │ ├─proc-sys-fs-binfmt_misc.automount
● │ │ ├─sys-fs-fuse-connections.mount
● │ │ ├─sys-kernel-config.mount
● │ │ ├─sys-kernel-debug.mount
● │ │ ├─systemd-ask-password-console.path
● │ │ ├─systemd-binfmt.service
● │ │ ├─systemd-firstboot.service
● │ │ ├─systemd-hwdb-update.service
● │ │ ├─systemd-journal-catalog-update.service
● │ │ ├─systemd-journal-flush.service
● │ │ ├─systemd-journald.service
● │ │ ├─systemd-machine-id-commit.service
● │ │ ├─systemd-modules-load.service
● │ │ ├─systemd-random-seed.service
● │ │ ├─systemd-sysctl.service
● │ │ ├─systemd-sysusers.service
● │ │ ├─systemd-tmpfiles-setup-dev.service
● │ │ ├─systemd-tmpfiles-setup.service
● │ │ ├─systemd-udev-trigger.service
● │ │ ├─systemd-udevd.service
● │ │ ├─systemd-update-done.service
● │ │ ├─systemd-update-utmp.service
● │ │ ├─systemd-vconsole-setup.service
● │ │ ├─cryptsetup.target
● │ │ ├─local-fs.target
● │ │ │ ├─-.mount
● │ │ │ ├─fedora-import-state.service
● │ │ │ ├─fedora-readonly.service
● │ │ │ ├─systemd-fsck-root.service
● │ │ │ ├─systemd-remount-fs.service
● │ │ │ └─tmp.mount
● │ │ └─swap.target
● │ └─timers.target
● │   └─systemd-tmpfiles-clean.timer
● ├─getty.target
● │ ├─getty@tty1.service
● │ └─serial-getty@ttyS0.service
● └─remote-fs.target

Ahh, i think we are on to something. I notice filebeat runs before network.service.


(Steffen Siering) #10

Yep. filebeat and logstash should be among the last processes to start, not the first.

Is a service file available for filebeat/logstash? Can you add network.service as requirement for filebeat/logstash?


(Sijis Aviles) #11

I would agree that those services should start much later in the bootup sequence.

"Is a service file available for filebeat/logstash?"

There is no filebeat.service or logstash.service file. I installed these from official elastic repositories.

"Can you add network.service as requirement for filebeat/logstash?"

Since there's no service file, i can't do it through systemd. There is probably a way to do it via the init.d method. i'd have to research it a bit as i don't know it offhand.

My first guess is that i'd have to modify these lines:

# Required-Start:    $remote_fs $syslog
# Required-Stop:     $remote_fs $syslog

This is the top of the startup scripts for both logstash and filebeat, so you can see what I have.

[fedora@ip-xx-xx-xx-xx init.d]$ ls
filebeat  functions  jexec  logstash  netconsole  network  README
[fedora@ip-xx-xx-xx-xx init.d]$ head filebeat logstash -n17
==> filebeat <==
#!/bin/bash
#
# filebeat          filebeat shipper
#
# chkconfig: 2345 98 02
#

### BEGIN INIT INFO
# Provides:          filebeat
# Required-Start:    $local_fs $network $syslog
# Required-Stop:     $local_fs $network $syslog
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Sends log files to Logstash or directly to Elasticsearch.
# Description:       filebeat is a shipper part of the Elastic Beats 
#                     family. Please see: https://www.elastic.co/products/beats
### END INIT INFO

==> logstash <==
#!/bin/sh
# Init script for logstash
# Maintained by Elasticsearch
# Generated by pleaserun.
# Implemented based on LSB Core 3.1:
#   * Sections: 20.2, 20.3
#
### BEGIN INIT INFO
# Provides:          logstash
# Required-Start:    $remote_fs $syslog
# Required-Stop:     $remote_fs $syslog
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description:
# Description:        Starts Logstash as a daemon.
### END INIT INFO

(Steffen Siering) #12

I'm not really sure about the order of list-dependencies. Can you also try

$ systemctl list-dependencies filebeat.service

I see some more services: cloud-config, cloud-final, cloud-init-local, cloud-init. I don't know the amazon ami well, but what's the purpose of these services? Maybe one of these changes the hostname?


(Sijis Aviles) #13

I do believe cloud-init stuff does change it. We don't set or change instances to any specific name or anything and just let ec2 dhcp basically handle it.

I'll provide the output of list-dependencies shortly.

Sijis


(Sijis Aviles) #14

As an added note, cloud-init is needed for user-data stuff in ec2, that's why it's there.

This is the requested output:

[fedora@ip-xx-xx-xx-xx init.d]$ systemctl list-dependencies filebeat.service
filebeat.service
● ├─system.slice
● ├─basic.target
● │ ├─dnf-makecache.timer
● │ ├─fedora-autorelabel-mark.service
● │ ├─fedora-autorelabel.service
● │ ├─fedora-loadmodules.service
● │ ├─paths.target
● │ ├─slices.target
● │ │ ├─-.slice
● │ │ └─system.slice
● │ ├─sockets.target
● │ │ ├─dbus.socket
● │ │ ├─systemd-initctl.socket
● │ │ ├─systemd-journald-audit.socket
● │ │ ├─systemd-journald-dev-log.socket
● │ │ ├─systemd-journald.socket
● │ │ ├─systemd-shutdownd.socket
● │ │ ├─systemd-udevd-control.socket
● │ │ └─systemd-udevd-kernel.socket
● │ ├─sysinit.target
● │ │ ├─dev-hugepages.mount
● │ │ ├─dev-mqueue.mount
● │ │ ├─dracut-shutdown.service
● │ │ ├─kmod-static-nodes.service
● │ │ ├─ldconfig.service
● │ │ ├─plymouth-read-write.service
● │ │ ├─plymouth-start.service
● │ │ ├─proc-sys-fs-binfmt_misc.automount
● │ │ ├─sys-fs-fuse-connections.mount
● │ │ ├─sys-kernel-config.mount
● │ │ ├─sys-kernel-debug.mount
● │ │ ├─systemd-ask-password-console.path
● │ │ ├─systemd-binfmt.service
● │ │ ├─systemd-firstboot.service
● │ │ ├─systemd-hwdb-update.service
● │ │ ├─systemd-journal-catalog-update.service
● │ │ ├─systemd-journal-flush.service
● │ │ ├─systemd-journald.service
● │ │ ├─systemd-machine-id-commit.service
● │ │ ├─systemd-modules-load.service
● │ │ ├─systemd-random-seed.service
● │ │ ├─systemd-sysctl.service
● │ │ ├─systemd-sysusers.service
● │ │ ├─systemd-tmpfiles-setup-dev.service
● │ │ ├─systemd-tmpfiles-setup.service
● │ │ ├─systemd-udev-trigger.service
● │ │ ├─systemd-udevd.service
● │ │ ├─systemd-update-done.service
● │ │ ├─systemd-update-utmp.service
● │ │ ├─systemd-vconsole-setup.service
● │ │ ├─cryptsetup.target
● │ │ ├─local-fs.target
● │ │ │ ├─-.mount
● │ │ │ ├─fedora-import-state.service
● │ │ │ ├─fedora-readonly.service
● │ │ │ ├─systemd-fsck-root.service
● │ │ │ ├─systemd-remount-fs.service
● │ │ │ └─tmp.mount
● │ │ └─swap.target
● │ └─timers.target
● │   └─systemd-tmpfiles-clean.timer
● └─network-online.target

(Sijis Aviles) #15

If this helps to reproduce, i'm doing the following.
i'm using the following ami (ami-81698dea) [us-east-1] and installing filebeat from the elastic repos with sudo dnf install filebeat

[fedora@ip-xx-xx-xx-xx init.d]$ cat /etc/yum.repos.d/filebeat.repo 
[filebeat]
name=filebeat repository
baseurl=http://packages.elasticsearch.org/beats/yum/el/x86_64/
gpgcheck=0
gpgkey=http://packages.elasticsearch.org/GPG-KEY-elasticsearch
enabled=1

Then i create an ami from those steps using packer.

Sijis


(Steffen Siering) #16

According to the Amazon Linux AMI basics, cloud-init will set the hostname. With systemd potentially parallelizing service startup, cloud-init must be a dependency for filebeat, I would say.
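
One way to check whether such an ordering is in place is to look at the unit's After= property. This is a small sketch, assuming `systemctl show -p After` prints a single `After=unit1 unit2 ...` line. `has_dep` is a hypothetical helper name, and cloud-final.service is simply taken from the dependency listing earlier in the thread.

```shell
#!/bin/sh
# has_dep LINE UNIT: succeed if UNIT is listed in a "Key=a b c" property line.
has_dep() {
    case " ${1#*=} " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

unit=filebeat.service
dep=cloud-final.service

# Only query systemd where systemctl exists; errors leave the line empty.
after_line=""
if command -v systemctl >/dev/null 2>&1; then
    after_line="$(systemctl show -p After "$unit" 2>/dev/null)"
fi

if has_dep "$after_line" "$dep"; then
    echo "$unit is ordered after $dep"
else
    echo "$unit is NOT ordered after $dep (or systemd is unavailable)"
fi
```

The parsing is kept separate from the systemctl call so the same helper can be pointed at other properties, such as Requires= or Wants=.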


(Sijis Aviles) #17

Yeah, that would seem like what would need to be done.
The packaging for filebeat doesn't use a systemd script to launch though. i'll see if i can hack something with either the provided script in /init.d or a custom service file.


(Steffen Siering) #18

Cool, looking forward to hearing about your experience. Unfortunately I don't know systemd or the amazon ami that well. Maybe we can integrate your solution into beats-packer.


(Sijis Aviles) #19

@steffens I ended up putting the following contents in /etc/systemd/system/filebeat.service and it started working just fine.

[Unit]
Description=Filebeat Service
Requires=cloud-final.service
After=syslog.target network.target cloud-final.service

[Service]
Type=simple
ExecStart=/usr/bin/filebeat -c /etc/filebeat/filebeat.yml

[Install]
WantedBy=multi-user.target

I did try to incorporate the changes into the beats-packer repo, but I wasn't able to fully understand the process, so I did not create a PR. However, I do think that, long term, a feature will have to be implemented in fpm that allows systemd service files to be used on rpm-based systems. In fpm's master branch, that feature appears to be available for debian-based systems.
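
For anyone repeating this, a unit file placed in /etc/systemd/system generally needs a daemon-reload and an enable before it takes effect at boot. Here is a rough sketch of those steps. It writes to a scratch directory by default so it can be tried safely, and the UNIT_DIR variable is a made-up name for this sketch, not something filebeat or systemd defines.

```shell
#!/bin/sh
# Write the unit file from this post into UNIT_DIR (a scratch dir by default).
# On a real host: set UNIT_DIR=/etc/systemd/system and run as root.
UNIT_DIR="${UNIT_DIR:-$(mktemp -d)}"
unit_path="$UNIT_DIR/filebeat.service"

cat > "$unit_path" <<'EOF'
[Unit]
Description=Filebeat Service
Requires=cloud-final.service
After=syslog.target network.target cloud-final.service

[Service]
Type=simple
ExecStart=/usr/bin/filebeat -c /etc/filebeat/filebeat.yml

[Install]
WantedBy=multi-user.target
EOF

echo "wrote $unit_path"

# Only touch systemd when the file went to the real unit directory.
if [ "$UNIT_DIR" = /etc/systemd/system ]; then
    systemctl daemon-reload            # make systemd re-read unit files
    systemctl enable filebeat.service  # start it on subsequent boots
fi
```

When baking an AMI, these steps would run in the image build, so that instances launched from it pick up the new ordering on first boot.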


(Steffen Siering) #20

I created a new github issue.

In the platforms directory, building for the different platforms is scripted. Every 'platform' has a build.sh, which is responsible for applying the local script templates to the current build parameters and running the final script (run-<id>.sh).

Maybe for AMI support we will need an extra package, or some checks for whether cloud-init is installed, to build the service file from a template. I don't know much about fpm/rpm myself.