Filebeat inconsistent with date-based index name when input type is "container"

Hello, all.

I was wondering if anyone has ever stumbled across this and if there are any workarounds (I've been bashing my head against it for entirely too long now...).

I have multiple servers using Filebeat to ship logs to a central Elasticsearch instance.
All Filebeat installations are Debian packages sourced from elastic.co, not Docker containers.
Logs are successfully sent and received with all expected metadata from /var/log/*.log via a filestream input and from /var/lib/docker/containers/*/*.log via a container input (with the add_docker_metadata processor).
All systems and Docker containers are configured for US/Central time, not UTC.

The problem is that the index name filebeat writes to (log-%{+yyyy.MM.dd}) is based on local time for the system files and UTC for the docker container files.
This results in logs for the same time being written to two different indexes for the hours that local and UTC dates differ. For example:

 {
  "_index": "log-2022.12.26",
  "_type": "_doc",
  "_id": "1UKU3dFvXGM",
  "_score": 6.228486318473047,
  "@timestamp": "2022-12-27T05:59:54.296Z",
--SNIP--
    "input": {
      "type": "filestream"
    },
    "log": {
      "file": {
        "path": "/var/log/syslog"
      },
      "offset": 94772
    },
--SNIP--
  }
}
{
  "_index": "log-2022.12.27",
  "_type": "_doc",
  "_id": "1UKTYk0pzlC",
  "_score": 6.611061873463312,
  "@timestamp": "2022-12-27T05:58:44.123000064Z",
--SNIP--
    "input": {
      "type": "container"
    },
    "log": {
      "file": {
        "path": "/var/lib/docker/containers/3f86dbb851588a7129890a92f885d3de0e7637252887f936a62359a46bc73b09/3f86dbb851588a7129890a92f885d3de0e7637252887f936a62359a46bc73b09-json.log"
      },
      "offset": 44300259
    },
--SNIP--
  }
}
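
To make the date arithmetic concrete, here is a minimal Python sketch. It only illustrates why the two index names differ for the timestamps above; it is not how Filebeat formats the index internally. The `index_name` helper and the fixed UTC-6 offset standing in for US/Central (no DST handling) are my own assumptions.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed UTC-6 offset standing in for US/Central in winter (CST).
CST = timezone(timedelta(hours=-6), "CST")


def index_name(ts: datetime, tz: timezone) -> str:
    """Build a daily index name from the calendar date of ts as seen in tz."""
    return ts.astimezone(tz).strftime("log-%Y.%m.%d")


# @timestamp of the filestream event shown above.
event = datetime(2022, 12, 27, 5, 59, 54, 296000, tzinfo=timezone.utc)

local_index = index_name(event, CST)         # date taken in local time
utc_index = index_name(event, timezone.utc)  # date taken in UTC

print(local_index)  # log-2022.12.26
print(utc_index)    # log-2022.12.27
```

The first name matches what I see for the filestream events and the second matches what I see for the container events, even though both events are less than two minutes apart.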

I'm running Filebeat 8.5.3 across the board and see the same results with the latest versions of both Elasticsearch and ZincSearch.
Here's my filebeat.yml:

filebeat.inputs:
- type: filestream
  id: syslogs
  enabled: true
  paths:
    - /var/log/*.log
    - /var/log/syslog
- type: container
  paths:
    - /var/lib/docker/containers/*/*.log
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false
output.elasticsearch:
  hosts: ["REDACTED"]
  path: "/es/"
  index: "log-%{+yyyy.MM.dd}"
  username: "REDACTED"
  password: "REDACTED"
processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_docker_metadata:
      host: "unix:///var/run/docker.sock"
setup.ilm.enabled: false
setup.template.name: "log"
setup.template.pattern: "log-*"
logging.level: error
logging.to_syslog: false
logging.to_files: false

Any thoughts, suggestions or insights would be greatly appreciated.

Thanks,

Matthew

Hi @GaijinSystems, welcome to the community.

Have you shelled into the actual running Filebeat Docker container and checked that the correct date / timezone is set? Docker defaults to UTC unless specifically set. Just because the host it is running on is set to a timezone, Docker will not automatically inherit it, as far as I know.

My host

hyperion:tmp sbrown$ date
Wed Dec 28 10:01:00 PST 2022

My Filebeat running in Docker on that same host

sh-5.0$ date
Wed Dec 28 18:01:05 UTC 2022
sh-5.0$ 

Filebeat will use the timezone of the host or the container it is directly running on.

Even if you added the locale, that is applied to each event, not to the naming of the index.

What you are seeing indicates to me that the container is still on UTC.

I'm curious, whether this is a bug or something else, why this matters. Your searches / alerts / dashboards should be using an index pattern log-*, so whether an actual log entry is in one index or the other should be of little consequence.

There is no guarantee that a log entry will always show up in the index whose name matches its date. There are plenty of other reasons why this can happen: catching up on logs, ingestion lag, and so on. These are not your issue, but I'm just pointing out that there is no guarantee, and hopefully you are not building logic / workflow around that assumption.

When you move to 8.x and data streams, this is all abstracted and you won't even see it.

Thank you for the responses, stephenb.

Just to clarify, filebeat is installed directly on the host and is not running in a container.
The system itself and all containers are set to CST.
(hostnames in the following output have been tweaked for clarity)

root@host:~# hostname ; date ; echo ; docker exec -it container hostname ; docker exec -it container date
host
Wed Dec 28 12:50:46 CST 2022

container
Wed Dec 28 12:50:46 CST 2022

When I first ran into this situation, I was running filebeat in a docker container as well.
I found this github issue when looking for remedies: https://github.com/elastic/beats/issues/31329
The workarounds mentioned in the comments fixed the actual time zone in the container but didn't solve the index naming problem.
My next step was to install filebeat locally to eliminate the docker environment as a contributing factor.

Admittedly, this seems to have less of an impact when shipping to Elasticsearch with a more common configuration, but when ILM is disabled or when shipping to ZincSearch, CPU and IO spike during the window of "competing" indexes.
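
For reference, here is a rough Python sketch of which local hours fall into that window. It assumes a fixed UTC-6 offset for CST (no DST handling), and the `dates_differ` helper is just something I made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed UTC-6 offset for CST; DST is ignored.
CST = timezone(timedelta(hours=-6), "CST")


def dates_differ(local_hour: int, tz: timezone = CST) -> bool:
    """Return True if the local and UTC calendar dates differ at this local hour."""
    local = datetime(2022, 12, 28, local_hour, 0, tzinfo=tz)
    return local.date() != local.astimezone(timezone.utc).date()


# Local hours during which a local-date index and a UTC-date index diverge.
competing_hours = [h for h in range(24) if dates_differ(h)]
print(competing_hours)  # [18, 19, 20, 21, 22, 23]
```

So with this offset, two daily indexes are being written to at the same time from 18:00 until midnight local time, every day.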
