Filebeat -> Elasticsearch for logrus running on k8s

We are trying to get started with Elasticsearch, but can't seem to get JSON from Filebeat into Elasticsearch nicely formatted. We are using logrus in Go and TypeScript, which writes to container log files in Kubernetes. We can read these logs in Filebeat, but can't get them to load correctly in Elasticsearch.

If we don't specify any json fields, the logs do surface in Kibana as

{
  "_index": "logs-app-api-test-2022.08.18",
  "_id": "M3G8soIBGrhScMFhfsad",
  "_version": 1,
  "_score": 0,
  "_source": {
    "@timestamp": "2022-08-18T20:54:27.599Z",
    "stream": "stdout",
    "message": "{\"action\":\"UpdateConfig\",\"event_id\":\"event-test-58a731e3-c318-4d47-9177-78eec15331e6\",\"status\":\"Success\",\"time\":\"2022-08-18T20:54:27.599Z\"}",
    "input": {
      "type": "container"
    },
    (many more k8s metadata fields) ...

We've tried various permutations of the fields like

 json.keys_under_root: true
 json.add_error_key: true
 json.message_key: "message"

and

- decode_json_fields:
    fields: ["message"]
    process_array: false
    max_depth: 3

to no great effect. What I'm looking for is to have a useful json object with my data, e.g.

action: "UpdateConfig",
status: "Success"

This is made more difficult by the fact that we have heterogeneous logs coming from different applications running in k8s: we do not have just JSON logging or just line logging, but a mixture of both.

I think the logstash equivalent is just

    filter {
      json {
        skip_on_invalid_json => true
        source => "message"
      }
    }

but there doesn't seem to be any combination of json settings that will do this for filebeat?

We are running Filebeat 7.17.3, the latest version supported in the Elastic Helm charts.
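
A rough Filebeat counterpart to the Logstash filter above, offered as a sketch rather than a verified fix: the decode_json_fields processor with target set to an empty string. Per the processor documentation, an empty target merges the decoded keys into the event root instead of replacing the message field. Lines that are not valid JSON should pass through with message unchanged, though that is worth confirming on 7.17.3.

```yaml
processors:
  - decode_json_fields:
      fields: ["message"]
      process_array: false
      max_depth: 3
      # An empty target merges decoded keys (action, status, ...) into the event root.
      target: ""
      # Lets decoded keys replace existing ones, including an inner "message" key.
      overwrite_keys: true
      # add_error_key is left at its default (off) so plain-text lines
      # in a mixed stream are not tagged with a decoding error.
```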

Hi @Austin_ES_Questions Welcome to the community.

Can you share your filebeat.yml config and a couple sample lines from the log files themselves?

Snippets make it hard to debug.

Most likely you are using the old syntax. The syntax you show above is not the current / correct syntax for decoding ndjson.

The new syntax is the parsers syntax, see here:

filebeat.inputs:
- type: filestream
  ...
  parsers:
    - ndjson:
        target: ""
        message_key: msg
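
For container log files under /var/log/containers/, a sketch of how this could be combined with the filestream container parser, which unwraps the Docker/CRI line format before ndjson sees the application's own JSON. This assumes the container parser is available in your Filebeat version; the path comes from this thread and the option values are illustrative, not tested here.

```yaml
filebeat.inputs:
- type: filestream
  paths:
    - /var/log/containers/*
  parsers:
    # Assumption: unwrap the runtime's log wrapper first, so the
    # application's own line ends up in the message field.
    - container:
        stream: all
    # Then decode that line as JSON and merge its keys into the event root.
    - ndjson:
        target: ""
        overwrite_keys: true
        # With mixed JSON and plain-text logs this marks non-JSON lines
        # with an error key; set to false if that is too noisy.
        add_error_key: true
```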

Hi Stephen

One we tried is

    filebeat.yml: |
      filebeat.inputs:
      - type: container
        paths:
          - /var/log/containers/*
        exclude_files:
          - /var/log/containers/filebeat-*
          - /var/log/containers/fluent-bit-*
          - /var/log/containers/logstash-*
          - /var/log/containers/logdna*

        json.keys_under_root: true
        json.add_error_key: true

        processors:
        - add_kubernetes_metadata:
            host: ${NODE_NAME}
            matchers:
            - logs_path:
                logs_path: "/var/log/containers/"

      output.elasticsearch:
        hosts: ["https://<host>.us-west-2.aws.found.io:443"]
        username: "elastic"
        password: "<pw>"
        index: "logs-%{[kubernetes.container.name]:unknown}-%{[kubernetes.labels.app_kubernetes_io/name]:unknown}-%{+yyyy.MM.dd}"

      setup.ilm.enabled: false
      setup.template.name: "30-days-default"
      setup.template.pattern: "logs-*-*-*"

I'll try the parsers variant - thanks for the pointer!

Hmmm, interesting... let us know. The docs still show the old syntax for containers; not sure if that is a doc bug or whether the container input still uses the old-style syntax.

You certainly could try the filestream input as well.

Hm, I think I still have something wrong. I've tried with both container and filestream, but it doesn't seem to be adding fields to the top level in Kibana:

    filebeat.yml: |
      filebeat.inputs:
      - type: filestream
        paths:
          - /var/log/containers/*
        exclude_files:
          - /var/log/containers/filebeat-*
          - /var/log/containers/fluent-bit-*
          - /var/log/containers/logstash-*
          - /var/log/containers/logdna*

        parsers:
          - ndjson:
              target: ""
              overwrite_keys: true
              message_key: message

        processors:
        - add_kubernetes_metadata:
            host: ${NODE_NAME}
            matchers:
            - logs_path:
                logs_path: "/var/log/containers/"

however, I was able to get top-level json running a test locally:

filebeat.inputs:
- type: filestream
  paths:
    - /var/log/aus*

  parsers:
    - ndjson:
        target: ""
        add_error_key: true
        overwrite_keys: true
        message_key: message

output.console:
  pretty: true

I also don't see any pertinent logging in the Filebeat logs themselves; they look like this:

INFO	[esclientleg]	eslegclient/connection.go:284	Attempting to connect to Elasticsearch version 8.3.3
2022-08-19T01:20:26.729Z	INFO	template/load.go:110	Template "30-days-default" already exists and will not be overwritten.
2022-08-19T01:20:26.729Z	INFO	[index-management]	idxmgmt/std.go:297	Loaded index template.
2022-08-19T01:20:26.732Z	INFO	[publisher_pipeline_output]	pipeline/output.go:151	Connection to backoff(elasticsearch(https://9884b8f3246148afbfcbe768ab9374cc.us-west-2.aws.found.io:443)) established
2022-08-19T01:20:35.687Z	INFO	[input.harvester]	log/harvester.go:309	Harvester started for paths: [/var/log/containers/*]	{"input_id": "dfeed8c2-3453-41bd-90a1-da7fd7b26a5e", "source": "/var/log/containers/datadog-ccvm2_datadog_agent-666232d7d1237ebba9fae9b381bb7ad80bd99c196081709e7e0d6df30d7a658b.log", "state_id": "native::34751240-66305", "finished": false, "os_id": "34751240-66305", "old_source": "/var/log/containers/datadog-ccvm2_datadog_agent-666232d7d1237ebba9fae9b381bb7ad80bd99c196081709e7e0d6df30d7a658b.log", "old_finished": true, "old_os_id": "34751240-66305", "harvester_id": "984425ab-5db9-4256-b3d9-5a421a014fb3"}
2022-08-19T01:20:55.593Z	INFO	[monitoring]	log/log.go:184	Non-zero metrics in the last 30s	{"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"cfs":{"period":{"us":100000},"quota":{"us":40000}},"id":"/","stats":{"periods":53,"throttled":{"ns":1163381779,"periods":10}}},"cpuacct":{"id":"/","total":{"ns":697481102}},"memory":{"id":"/","mem":{"limit":{"bytes":209715200},"usage":{"bytes":54308864}}}},"cpu":{"system":{"ticks":60,"time":{"ms":61}},"total":{"ticks":280,"time":{"ms":284},"value":280},"user":{"ticks":220,"time":{"ms":223}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":15},"info":{"ephemeral_id":"b955f134-d927-4036-a625-a5d9d2b4c577","uptime":{"ms":30284},"version":"7.17.3"},"memstats":{"gc_next":24286336,"memory_alloc":18659104,"memory_sys":33637384,"memory_total":77060296,"rss":129179648},"runtime":{"goroutines":79}},"filebeat":{"events":{"added":45,"done":45},"harvester":{"open_files":3,"running":3,"started":3}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":17,"active":0,"batches":6,"total":17},"read":{"bytes":6160},"type":"elasticsearch","write":{"bytes":29800}},"pipeline":{"clients":1,"events":{"active":0,"filtered":28,"published":17,"retry":9,"total":45},"queue":{"acked":17,"max_events":4096}}},"registrar":{"states":{"current":35,"update":45},"writes":{"success":34,"total":34}},"system":{"cpu":{"cores":4},"load":{"1":0.18,"15":0.22,"5":0.25,"norm":{"1":0.045,"15":0.055,"5":0.0625}}}}}}
2022-08-19T01:21:25.592Z	INFO	[monitoring]	log/log.go:184	Non-zero metrics in the last 30s	{"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"stats":{"periods":38,"throttled":{"ns":1648409142,"periods":13}}},"cpuacct":{"total":{"ns":623128294}},"memory":{"mem":{"usage":{"bytes":1261568}}}},"cpu":{"system":{"ticks":80,"time":{"ms":24}},"total":{"ticks":300,"time":{"ms":24},"value":300},"user":{"ticks":220}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":15},"info":{"ephemeral_id":"b955f134-d927-4036-a625-a5d9d2b4c577","uptime":{"ms":60285},"version":"7.17.3"},"memstats":{"gc_next":24286336,"memory_alloc":21928344,"memory_sys":4456448,"memory_total":80329536,"rss":130711552},"runtime":{"goroutines":79}},"filebeat":{"events":{"added":4,"done":4},"harvester":{"open_files":3,"running":3}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":4,"active":0,"batches":4,"total":4},"read":{"bytes":2189},"write":{"bytes":11452}},"pipeline":{"clients":1,"events":{"active":0,"published":4,"total":4},"queue":{"acked":4}}},"registrar":{"states":{"current":35,"update":4},"writes":{"success":4,"total":4}},"system":{"load":{"1":0.11,"15":0.21,"5":0.23,"norm":{"1":0.0275,"15":0.0525,"5":0.0575}}}}}}
2022-08-19T01:21:55.592Z	INFO	[monitoring]	log/log.go:184	Non-zero metrics in the last 30s	{"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"stats":{"periods":33,"throttled":{"ns":1551736293,"periods":12}}},"cpuacct":{"total":{"ns":615573051}},"memory":{"mem":{"usage":{"bytes":1744896}}}},"cpu":{"system":{"ticks":90,"time":{"ms":12}},"total":{"ticks":320,"time":{"ms":25},"value":320},"user":{"ticks":230,"time":{"ms":13}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":15},"info":{"ephemeral_id":"b955f134-d927-4036-a625-a5d9d2b4c577","uptime":{"ms":90286},"version":"7.17.3"},"memstats":{"gc_next":23749568,"memory_alloc":12512656,"memory_total":82098904,"rss":131874816},"runtime":{"goroutines":79}},"filebeat":{"events":{"added":2,"done":2},"harvester":{"open_files":3,"running":3}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":2,"active":0,"batches":2,"total":2},"read":{"bytes":1094},"write":{"bytes":5726}},"pipeline":{"clients":1,"events":{"active":0,"published":2,"total":2},"queue":{"acked":2}}},"registrar":{"states":{"current":35,"update":2},"writes":{"success":2,"total":2}},"system":{"load":{"1":0.13,"15":0.2,"5":0.22,"norm":{"1":0.0325,"15":0.05,"5":0.055}}}}}}
2022-08-19T01:22:25.593Z	INFO	[monitoring]	log/log.go:184	Non-zero metrics in the last 30s	{"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"stats":{"periods":37,"throttled":{"ns":2112483487,"periods":13}}},"cpuacct":{"total":{"ns":625012768}},"memory":{"mem":{"usage":{"bytes":667648}}}},"cpu":{"system":{"ticks":110,"time":{"ms":15}},"total":{"ticks":340,"time":{"ms":18},"value":340},"user":{"ticks":230,"time":{"ms":3}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":15},"info":{"ephemeral_id":"b955f134-d927-4036-a625-a5d9d2b4c577","uptime":{"ms":120284},"version":"7.17.3"},"memstats":{"gc_next":23749568,"memory_alloc":15019520,"memory_total":84605768,"rss":132141056},"runtime":{"goroutines":79}},"filebeat":{"events":{"added":3,"done":3},"harvester":{"open_files":3,"running":3}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":3,"active":0,"batches":3,"total":3},"read":{"bytes":1642},"write":{"bytes":8589}},"pipeline":{"clients":1,"events":{"active":0,"published":3,"total":3},"queue":{"acked":3}}},"registrar":{"states":{"current":35,"update":3},"writes":{"success":3,"total":3}},"system":{"load":{"1":0.08,"15":0.2,"5":0.2,"norm":{"1":0.02,"15":0.05,"5":0.05}}}}}}
2022-08-19T01:22:55.592Z	INFO	[monitoring]	log/log.go:184	Non-zero metrics in the last 30s	{"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"stats":{"periods":37,"throttled":{"ns":1356922020,"periods":12}}},"cpuacct":{"total":{"ns":607688534}},"memory":{"mem":{"usage":{"bytes":233472}}}},"cpu":{"system":{"ticks":110,"time":{"ms":7}},"total":{"ticks":350,"time":{"ms":18},"value":350},"user":{"ticks":240,"time":{"ms":11}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":15},"info":{"ephemeral_id":"b955f134-d927-4036-a625-a5d9d2b4c577","uptime":{"ms":150283},"version":"7.17.3"},"memstats":{"gc_next":23749568,"memory_alloc":17418656,"memory_total":87004904,"rss":132358144},"runtime":{"goroutines":79}},"filebeat":{"events":{"added":3,"done":3},"harvester":{"open_files":3,"running":3}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":3,"active":0,"batches":3,"total":3},"read":{"bytes":1640},"write":{"bytes":8589}},"pipeline":{"clients":1,"events":{"active":0,"published":3,"total":3},"queue":{"acked":3}}},"registrar":{"states":{"current":35,"update":3},"writes":{"success":3,"total":3}},"system":{"load":{"1":0.04,"15":0.19,"5":0.18,"norm":{"1":0.01,"15":0.0475,"5":0.045}}}}}}
2022-08-19T01:23:25.593Z	INFO	[monitoring]	log/log.go:184	Non-zero metrics in the last 30s	{"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"stats":{"periods":47,"throttled":{"ns":1776686610,"periods":13}}},"cpuacct":{"total":{"ns":626207689}},"memory":{"mem":{"usage":{"bytes":1019904}}}},"cpu":{"system":{"ticks":140,"time":{"ms":23}},"total":{"ticks":390,"time":{"ms":27},"value":390},"user":{"ticks":250,"time":{"ms":4}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":15},"info":{"ephemeral_id":"b955f134-d927-4036-a625-a5d9d2b4c577","uptime":{"ms":180286},"version":"7.17.3"},"memstats":{"gc_next":24207184,"memory_alloc":22237816,"memory_total":91824064,"rss":133107712},"runtime":{"goroutines":79}},"filebeat":{"events":{"added":8,"done":8},"harvester":{"open_files":3,"running":3}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":8,"active":0,"batches":8,"total":8},"read":{"bytes":4330},"write":{"bytes":22702}},"pipeline":{"clients":1,"events":{"active":0,"published":8,"total":8},"queue":{"acked":8}}},"registrar":{"states":{"current":35,"update":8},"writes":{"success":8,"total":8}},"system":{"load":{"1":0.03,"15":0.18,"5":0.16,"norm":{"1":0.0075,"15":0.045,"5":0.04}}}}}}
2022-08-19T01:23:55.592Z	INFO	[monitoring]	log/log.go:184	Non-zero metrics in the last 30s	{"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"stats":{"periods":35,"throttled":{"ns":2011123706,"periods":12}}},"cpuacct":{"total":{"ns":618946143}},"memory":{"mem":{"usage":{"bytes":339968}}}},"cpu":{"system":{"ticks":160,"time":{"ms":22}},"total":{"ticks":420,"time":{"ms":30},"value":420},"user":{"ticks":260,"time":{"ms":8}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":15},"info":{"ephemeral_id":"b955f134-d927-4036-a625-a5d9d2b4c577","uptime":{"ms":210284},"version":"7.17.3"},"memstats":{"gc_next":24371808,"memory_alloc":13824696,"memory_total":93665752,"rss":133107712},"runtime":{"goroutines":79}},"filebeat":{"events":{"added":2,"done":2},"harvester":{"open_files":3,"running":3}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":2,"active":0,"batches":2,"total":2},"read":{"bytes":1093},"write":{"bytes":5726}},"pipeline":{"clients":1,"events":{"active":0,"published":2,"total":2},"queue":{"acked":2}}},"registrar":{"states":{"current":35,"update":2},"writes":{"success":2,"total":2}},"system":{"load":{"1":0.01,"15":0.17,"5":0.14,"norm":{"1":0.0025,"15":0.0425,"5":0.035}}}}}}
2022-08-19T01:24:05.713Z	INFO	[input.harvester]	log/harvester.go:309	Harvester started for paths: [/var/log/containers/*]	{"input_id": "dfeed8c2-3453-41bd-90a1-da7fd7b26a5e", "source": "/var/log/containers/kube-proxy-tb5dx_kube-system_kube-proxy-c8f7419322880d172a8539cc4245b145e78958fba6d1304ea8c79541c7bb681a.log", "state_id": "native::37753109-66305", "finished": false, "os_id": "37753109-66305", "old_source": "/var/log/containers/kube-proxy-tb5dx_kube-system_kube-proxy-c8f7419322880d172a8539cc4245b145e78958fba6d1304ea8c79541c7bb681a.log", "old_finished": true, "old_os_id": "37753109-66305", "harvester_id": "64e87a67-2e37-44d1-90af-621535ee2602"}
2022-08-19T01:24:25.592Z	INFO	[monitoring]	log/log.go:184	Non-zero metrics in the last 30s	{"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"stats":{"periods":43,"throttled":{"ns":1779780774,"periods":12}}},"cpuacct":{"total":{"ns":615261401}},"memory":{"mem":{"usage":{"bytes":393216}}}},"cpu":{"system":{"ticks":180,"time":{"ms":17}},"total":{"ticks":440,"time":{"ms":24},"value":440},"user":{"ticks":260,"time":{"ms":7}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":16},"info":{"ephemeral_id":"b955f134-d927-4036-a625-a5d9d2b4c577","uptime":{"ms":240286},"version":"7.17.3"},"memstats":{"gc_next":24371808,"memory_alloc":16933256,"memory_total":96774312,"rss":133857280},"runtime":{"goroutines":84}},"filebeat":{"events":{"added":5,"done":5},"harvester":{"open_files":4,"running":4,"started":1}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":4,"active":0,"batches":4,"total":4},"read":{"bytes":2182},"write":{"bytes":11359}},"pipeline":{"clients":1,"events":{"active":0,"filtered":1,"published":4,"total":5},"queue":{"acked":4}}},"registrar":{"states":{"current":35,"update":5},"writes":{"success":5,"total":5}},"system":{"load":{"1":0.01,"15":0.17,"5":0.13,"norm":{"1":0.0025,"15":0.0425,"5":0.0325}}}}}}
2022-08-19T01:24:55.592Z	INFO	[monitoring]	log/log.go:184	Non-zero metrics in the last 30s	{"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"stats":{"periods":36,"throttled":{"ns":1752551130,"periods":13}}},"cpuacct":{"total":{"ns":621369622}},"memory":{"mem":{"usage":{"bytes":262144}}}},"cpu":{"system":{"ticks":190,"time":{"ms":11}},"total":{"ticks":460,"time":{"ms":18},"value":460},"user":{"ticks":270,"time":{"ms":7}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":16},"info":{"ephemeral_id":"b955f134-d927-4036-a625-a5d9d2b4c577","uptime":{"ms":270284},"version":"7.17.3"},"memstats":{"gc_next":24371808,"memory_alloc":19334312,"memory_total":99175368,"rss":133857280},"runtime":{"goroutines":84}},"filebeat":{"events":{"added":3,"done":3},"harvester":{"open_files":4,"running":4}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":3,"active":0,"batches":3,"total":3},"read":{"bytes":1638},"write":{"bytes":8589}},"pipeline":{"clients":1,"events":{"active":0,"published":3,"total":3},"queue":{"acked":3}}},"registrar":{"states":{"current":35,"update":3},"writes":{"success":3,"total":3}},"system":{"load":{"1":0.08,"15":0.17,"5":0.13,"norm":{"1":0.02,"15":0.0425,"5":0.0325}}}}}}
2022-08-19T01:25:25.593Z	INFO	[monitoring]	log/log.go:184	Non-zero metrics in the last 30s	{"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"stats":{"periods":39,"throttled":{"ns":1730898239,"periods":11}}},"cpuacct":{"total":{"ns":615605647}},"memory":{"mem":{"usage":{"bytes":229376}}}},"cpu":{"system":{"ticks":200,"time":{"ms":16}},"total":{"ticks":480,"time":{"ms":23},"value":480},"user":{"ticks":280,"time":{"ms":7}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":16},"info":{"ephemeral_id":"b955f134-d927-4036-a625-a5d9d2b4c577","uptime":{"ms":300286},"version":"7.17.3"},"memstats":{"gc_next":24371808,"memory_alloc":22564656,"memory_total":102405712,"rss":134115328},"runtime":{"goroutines":84}},"filebeat":{"events":{"added":4,"done":4},"harvester":{"open_files":4,"running":4}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":4,"active":0,"batches":4,"total":4},"read":{"bytes":2189},"write":{"bytes":11452}},"pipeline":{"clients":1,"events":{"active":0,"published":4,"total":4},"queue":{"acked":4}}},"registrar":{"states":{"current":35,"update":4},"writes":{"success":4,"total":4}},"system":{"load":{"1":0.05,"15":0.16,"5":0.12,"norm":{"1":0.0125,"15":0.04,"5":0.03}}}}}}
2022-08-19T01:25:55.592Z	INFO	[monitoring]	log/log.go:184	Non-zero metrics in the last 30s	{"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"stats":{"periods":45,"throttled":{"ns":1844715270,"periods":14}}},"cpuacct":{"total":{"ns":634787709}},"memory":{"mem":{"usage":{"bytes":-106496}}}},"cpu":{"system":{"ticks":220,"time":{"ms":15}},"total":{"ticks":510,"time":{"ms":28},"value":510},"user":{"ticks":290,"time":{"ms":13}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":16},"info":{"ephemeral_id":"b955f134-d927-4036-a625-a5d9d2b4c577","uptime":{"ms":330284},"version":"7.17.3"},"memstats":{"gc_next":24429776,"memory_alloc":13823520,"memory_total":104358424,"rss":134115328},"runtime":{"goroutines":84}},"filebeat":{"events":{"added":2,"done":2},"harvester":{"open_files":4,"running":4}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":2,"active":0,"batches":2,"total":2},"read":{"bytes":1097},"write":{"bytes":5726}},"pipeline":{"clients":1,"events":{"active":0,"published":2,"total":2},"queue":{"acked":2}}},"registrar":{"states":{"current":35,"update":2},"writes":{"success":2,"total":2}},"system":{"load":{"1":0.03,"15":0.16,"5":0.11,"norm":{"1":0.0075,"15":0.04,"5":0.0275}}}}}}
2022-08-19T01:26:25.592Z	INFO	[monitoring]	log/log.go:184	Non-zero metrics in the last 30s	{"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"stats":{"periods":43,"throttled":{"ns":1841492677,"periods":13}}},"cpuacct":{"total":{"ns":626716708}},"memory":{"mem":{"usage":{"bytes":139264}}}},"cpu":{"system":{"ticks":230,"time":{"ms":15}},"total":{"ticks":520,"time":{"ms":18},"value":520},"user":{"ticks":290,"time":{"ms":3}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":16},"info":{"ephemeral_id":"b955f134-d927-4036-a625-a5d9d2b4c577","uptime":{"ms":360283},"version":"7.17.3"},"memstats":{"gc_next":24429776,"memory_alloc":16218976,"memory_total":106753880,"rss":134115328},"runtime":{"goroutines":84}},"filebeat":{"events":{"added":3,"done":3},"harvester":{"open_files":4,"running":4}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":3,"active":0,"batches":3,"total":3},"read":{"bytes":1639},"write":{"bytes":8589}},"pipeline":{"clients":1,"events":{"active":0,"published":3,"total":3},"queue":{"acked":3}}},"registrar":{"states":{"current":35,"update":3},"writes":{"success":3,"total":3}},"system":{"load":{"1":0.02,"15":0.15,"5":0.09,"norm":{"1":0.005,"15":0.0375,"5":0.0225}}}}}}
2022-08-19T01:26:55.592Z	INFO	[monitoring]	log/log.go:184	Non-zero metrics in the last 30s	{"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"stats":{"periods":45,"throttled":{"ns":1606386650,"periods":11}}},"cpuacct":{"total":{"ns":610399638}},"memory":{"mem":{"usage":{"bytes":49152}}}},"cpu":{"system":{"ticks":250,"time":{"ms":18}},"total":{"ticks":550,"time":{"ms":20},"value":550},"user":{"ticks":300,"time":{"ms":2}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":16},"info":{"ephemeral_id":"b955f134-d927-4036-a625-a5d9d2b4c577","uptime":{"ms":390285},"version":"7.17.3"},"memstats":{"gc_next":24429776,"memory_alloc":18887096,"memory_total":109422000,"rss":134115328},"runtime":{"goroutines":84}},"filebeat":{"events":{"added":4,"done":4},"harvester":{"open_files":4,"running":4}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":4,"active":0,"batches":3,"total":4},"read":{"bytes":1673},"write":{"bytes":10967}},"pipeline":{"clients":1,"events":{"active":0,"published":4,"total":4},"queue":{"acked":4}}},"registrar":{"states":{"current":35,"update":4},"writes":{"success":3,"total":3}},"system":{"load":{"1":0.11,"15":0.16,"5":0.11,"norm":{"1":0.0275,"15":0.04,"5":0.0275}}}}}}
2022-08-19T01:27:25.592Z	INFO	[monitoring]	log/log.go:184	Non-zero metrics in the last 30s	{"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"stats":{"periods":42,"throttled":{"ns":1917792907,"periods":12}}},"cpuacct":{"total":{"ns":618175197}},"memory":{"mem":{"usage":{"bytes":-40960}}}},"cpu":{"system":{"ticks":260,"time":{"ms":11}},"total":{"ticks":570,"time":{"ms":20},"value":570},"user":{"ticks":310,"time":{"ms":9}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":16},"info":{"ephemeral_id":"b955f134-d927-4036-a625-a5d9d2b4c577","uptime":{"ms":420285},"version":"7.17.3"},"memstats":{"gc_next":24429776,"memory_alloc":21686192,"memory_total":112221096,"rss":134115328},"runtime":{"goroutines":84}},"filebeat":{"events":{"added":4,"done":4},"harvester":{"open_files":4,"running":4}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":4,"active":0,"batches":4,"total":4},"read":{"bytes":2190},"write":{"bytes":11452}},"pipeline":{"clients":1,"events":{"active":0,"published":4,"total":4},"queue":{"acked":4}}},"registrar":{"states":{"current":35,"update":4},"writes":{"success":4,"total":4}},"system":{"load":{"1":0.07,"15":0.15,"5":0.1,"norm":{"1":0.0175,"15":0.0375,"5":0.025}}}}}}
2022-08-19T01:27:55.592Z	INFO	[monitoring]	log/log.go:184	Non-zero metrics in the last 30s	{"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"stats":{"periods":41,"throttled":{"ns":2338412301,"periods":13}}},"cpuacct":{"total":{"ns":627576885}},"memory":{"mem":{"usage":{"bytes":98304}}}},"cpu":{"system":{"ticks":270,"time":{"ms":10}},"total":{"ticks":590,"time":{"ms":29},"value":590},"user":{"ticks":320,"time":{"ms":19}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":16},"info":{"ephemeral_id":"b955f134-d927-4036-a625-a5d9d2b4c577","uptime":{"ms":450285},"version":"7.17.3"},"memstats":{"gc_next":24290448,"memory_alloc":13945856,"memory_total":114374552,"rss":134115328},"runtime":{"goroutines":84}},"filebeat":{"events":{"added":2,"done":2},"harvester":{"open_files":4,"running":4}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":2,"active":0,"batches":2,"total":2},"read":{"bytes":1093},"write":{"bytes":5726}},"pipeline":{"clients":1,"events":{"active":0,"published":2,"total":2},"queue":{"acked":2}}},"registrar":{"states":{"current":35,"update":2},"writes":{"success":2,"total":2}},"system":{"load":{"1":0.04,"15":0.14,"5":0.09,"norm":{"1":0.01,"15":0.035,"5":0.0225}}}}}}
2022-08-19T01:28:25.593Z	INFO	[monitoring]	log/log.go:184	Non-zero metrics in the last 30s	{"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"stats":{"periods":41,"throttled":{"ns":1693147588,"periods":12}}},"cpuacct":{"total":{"ns":624799245}},"memory":{"mem":{"usage":{"bytes":237568}}}},"cpu":{"system":{"ticks":280,"time":{"ms":7}},"total":{"ticks":620,"time":{"ms":20},"value":620},"user":{"ticks":340,"time":{"ms":13}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":16},"info":{"ephemeral_id":"b955f134-d927-4036-a625-a5d9d2b4c577","uptime":{"ms":480284},"version":"7.17.3"},"memstats":{"gc_next":24290448,"memory_alloc":16313608,"memory_total":116742304,"rss":134115328},"runtime":{"goroutines":84}},"filebeat":{"events":{"added":4,"done":4},"harvester":{"open_files":4,"running":4}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":4,"active":0,"batches":3,"total":4},"read":{"bytes":1667},"write":{"bytes":10966}},"pipeline":{"clients":1,"events":{"active":0,"published":4,"total":4},"queue":{"acked":4}}},"registrar":{"states":{"current":35,"update":4},"writes":{"success":3,"total":3}},"system":{"load":{"1":0.02,"15":0.14,"5":0.08,"norm":{"1":0.005,"15":0.035,"5":0.02}}}}}}
2022-08-19T01:28:55.592Z	INFO	[monitoring]	log/log.go:184	Non-zero metrics in the last 30s	{"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"stats":{"periods":38,"throttled":{"ns":1893888553,"periods":13}}},"cpuacct":{"total":{"ns":626996551}},"memory":{"mem":{"usage":{"bytes":8192}}}},"cpu":{"system":{"ticks":290,"time":{"ms":12}},"total":{"ticks":640,"time":{"ms":20},"value":640},"user":{"ticks":350,"time":{"ms":8}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":16},"info":{"ephemeral_id":"b955f134-d927-4036-a625-a5d9d2b4c577","uptime":{"ms":510284},"version":"7.17.3"},"memstats":{"gc_next":24290448,"memory_alloc":18989576,"memory_total":119418272,"rss":134115328},"runtime":{"goroutines":84}},"filebeat":{"events":{"added":3,"done":3},"harvester":{"open_files":4,"running":4}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":3,"active":0,"batches":3,"total":3},"read":{"bytes":1641},"write":{"bytes":8589}},"pipeline":{"clients":1,"events":{"active":0,"published":3,"total":3},"queue":{"acked":3}}},"registrar":{"states":{"current":35,"update":3},"writes":{"success":3,"total":3}},"system":{"load":{"1":0.01,"15":0.13,"5":0.07,"norm":{"1":0.0025,"15":0.0325,"5":0.0175}}}}}}
2022-08-19T01:29:05.750Z	INFO	[input.harvester]	log/harvester.go:309	Harvester started for paths: [/var/log/containers/*]	{"input_id": "dfeed8c2-3453-41bd-90a1-da7fd7b26a5e", "source": "/var/log/containers/web-backend-deployment-6c84967d8d-k6xr6_zed_app-494ffb87e3921ede2ae920bc811d91c2138db17dd057b66268560dd5cd7ad448.log", "state_id": "native::5774094-66305", "finished": false, "os_id": "5774094-66305", "old_source": "/var/log/containers/web-backend-deployment-6c84967d8d-k6xr6_zed_app-494ffb87e3921ede2ae920bc811d91c2138db17dd057b66268560dd5cd7ad448.log", "old_finished": true, "old_os_id": "5774094-66305", "harvester_id": "21441a3f-3f04-46a3-a75e-d11c182c2625"}
2022-08-19T01:29:10.735Z	INFO	[input.harvester]	log/harvester.go:340	File is inactive. Closing because close_inactive of 5m0s reached.	{"input_id": "dfeed8c2-3453-41bd-90a1-da7fd7b26a5e", "source": "/var/log/containers/kube-proxy-tb5dx_kube-system_kube-proxy-c8f7419322880d172a8539cc4245b145e78958fba6d1304ea8c79541c7bb681a.log", "state_id": "native::37753109-66305", "finished": false, "os_id": "37753109-66305", "old_source": "/var/log/containers/kube-proxy-tb5dx_kube-system_kube-proxy-c8f7419322880d172a8539cc4245b145e78958fba6d1304ea8c79541c7bb681a.log", "old_finished": true, "old_os_id": "37753109-66305", "harvester_id": "64e87a67-2e37-44d1-90af-621535ee2602"}
2022-08-19T01:29:25.592Z	INFO	[monitoring]	log/log.go:184	Non-zero metrics in the last 30s	{"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"stats":{"periods":45,"throttled":{"ns":1880789215,"periods":13}}},"cpuacct":{"total":{"ns":635972040}},"memory":{"mem":{"usage":{"bytes":61440}}}},"cpu":{"system":{"ticks":310,"time":{"ms":20}},"total":{"ticks":670,"time":{"ms":38},"value":670},"user":{"ticks":360,"time":{"ms":18}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":16},"info":{"ephemeral_id":"b955f134-d927-4036-a625-a5d9d2b4c577","uptime":{"ms":540284},"version":"7.17.3"},"memstats":{"gc_next":24725872,"memory_alloc":12670848,"memory_total":123239712,"rss":134115328},"runtime":{"goroutines":84}},"filebeat":{"events":{"added":19,"done":19},"harvester":{"closed":1,"open_files":4,"running":4,"started":1}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"acked":17,"active":0,"batches":4,"total":17},"read":{"bytes":2307},"write":{"bytes":50005}},"pipeline":{"clients":1,"events":{"active":0,"filtered":2,"published":17,"total":19},"queue":{"acked":17}}},"registrar":{"states":{"current":35,"update":19},"writes":{"success":6,"total":6}},"system":{"load":{"1":0.2,"15":0.14,"5":0.11,"norm":{"1":0.05,"15":0.035,"5":0.0275}}}}}}

and surfacing logs that look like

{
  "_index": "logs-app-web-backend-2022.08.19",
  "_id": "ZnK3s4IBGrhScMFh2dfC",
  "_version": 1,
  "_score": 0,
  "_source": {
    "@timestamp": "2022-08-19T01:29:01.161Z",
    "log": {
      "file": {
        "path": "/var/log/containers/web-backend-deployment-6c84967d8d-k6xr6_zed_app-494ffb87e3921ede2ae920bc811d91c2138db17dd057b66268560dd5cd7ad448.log"
      },
      "offset": 532271
    },
    "stream": "stdout",
    "container": {
      "image": {
        "name": "928657068455.dkr.ecr.us-west-2.amazonaws.com/web-backend:99b346d87d543a854c2f5955972fc7abf78a7570"
      },
      "id": "494ffb87e3921ede2ae920bc811d91c2138db17dd057b66268560dd5cd7ad448",
      "runtime": "docker"
    },
    "host": {
      "name": "filebeat-filebeat-m2f6q"
    },
    "ecs": {
      "version": "1.12.0"
    },
    "message": "{\"email_id\":\"email-live-7f415f34-32c1-4926-8277-f0baa8c90cd0\",\"level\":\"info\",\"message\":\"sent successfully\",\"request_id\":\"request-id-live-2f63d861-3f59-4370-88bf-001caefd2cc3\",\"status_code\":200,\"traceId\":\"6afa1be77646ffb2f44831d1ca8c9891\",\"user_id\":\"user-live-7d42d83f-ec36-490d-bd43-ae2ef5b406d0\"}",
    "input": {
      "type": "container"
    },
    "kubernetes": {
      "namespace_labels": {
        "kubernetes_io/metadata_name": "zed",
        "app_kubernetes_io/instance": "zed-remote-dev",
        "dev_stytch_com/user": "zed"
      },
     (many more kubernetes fields) ...

Is it a problem that the message JSON here is stringified? Should I be combining this with the decode_json_fields step?

Can you provide a couple of raw lines from the *.log file?
No, the message field looks pretty good to me.

But I am a little confused...

Not sure exactly what that means... you tried to just run against a few log lines?

You show this as your filebeat config

filebeat.inputs:
- type: filestream
  paths:
    - /var/log/aus*

but your output shows

    "input": {
      "type": "container"
    },

so I am unclear whether you are really using the filebeat.yml you expect.

Sorry, bouncing between different attempts at running

(1)

Austin_ES_Questions wrote: "however, I was able to get top-level json running a test locally:"

I was able to run filebeat locally on my laptop, with and without the ndjson parser, using filestream, and it does behave as expected. When I tried the container input on my laptop, it oddly didn't log anything at all

(2)

Hm, it could be that filebeat failed to update and I didn't realize it. I'm trying this:

daemonset:
  enabled: true
  filebeatConfig:
    filebeat.yml: |
      filebeat.inputs:
      - type: filestream
        paths:
          - /var/log/containers/*
        exclude_files:
          - /var/log/containers/filebeat-*
          - /var/log/containers/fluent-bit-*
          - /var/log/containers/logstash-*
          - /var/log/containers/logdna*

        parsers:
          - ndjson:
              target: ""
              overwrite_keys: true
              message_key: message

        processors:
          - add_fields:
              target: meta
              fields:
                name: "austin-test-001"
          - add_kubernetes_metadata:
              host: ${NODE_NAME}
              matchers:
              - logs_path:
                  logs_path: "/var/log/containers/"

      output.elasticsearch:
        hosts: ["https://<host>.us-west-2.aws.found.io:443"]
        username: "elastic"
        password: "<pw>"
        index: "logs-%{[kubernetes.container.name]:unknown}-%{[kubernetes.labels.app_kubernetes_io/name]:unknown}-%{+yyyy.MM.dd}"

      setup.ilm.enabled: false
      setup.template.name: "30-days-default"
      setup.template.pattern: "logs-*-*-*"

and curiously not getting any logs from filebeat at all now

Updated to try and include parsers.container.stream: all, still no logs in ES

daemonset:
  enabled: true
  filebeatConfig:
    filebeat.yml: |
      filebeat.inputs:
      - type: filestream
        id: filebeat-container-all-non-logger-application-logs
        paths:
          - /var/log/containers/*
        exclude_files:
          - /var/log/containers/filebeat-*
          - /var/log/containers/fluent-bit-*
          - /var/log/containers/logstash-*
          - /var/log/containers/logdna*

        parsers:
          - container:
              stream: all
          - ndjson:
              target: ""
              overwrite_keys: true
              message_key: message

        processors:
          - add_fields:
              target: meta
              fields:
                name: "austin-test-003"
          - add_kubernetes_metadata:
              host: ${NODE_NAME}
              matchers:
              - logs_path:
                  logs_path: "/var/log/containers/"

      output.elasticsearch:
        hosts: ["https://<host>.us-west-2.aws.found.io:443"]
        username: "elastic"
        password: "<pw>"
        index: "logs-%{[kubernetes.container.name]:unknown}-%{[kubernetes.labels.app_kubernetes_io/name]:unknown}-%{+yyyy.MM.dd}"

      setup.ilm.enabled: false
      setup.template.name: "30-days-default"
      setup.template.pattern: "logs-*-*-*"

Example filebeat output

2022-08-19T03:35:57.473Z	INFO	instance/beat.go:685	Home path: [/usr/share/filebeat] Config path: [/usr/share/filebeat] Data path: [/usr/share/filebeat/data] Logs path: [/usr/share/filebeat/logs] Hostfs Path: [/]
2022-08-19T03:35:57.473Z	INFO	instance/beat.go:693	Beat ID: 360f943c-f2f4-41a4-a410-00db1745adcf
2022-08-19T03:35:57.474Z	INFO	[api]	api/server.go:62	Starting stats endpoint
2022-08-19T03:35:57.474Z	INFO	[api]	api/server.go:64	Metrics endpoint listening on: 127.0.0.1:5066 (configured: localhost)
2022-08-19T03:35:57.474Z	INFO	[seccomp]	seccomp/seccomp.go:124	Syscall filter successfully installed
2022-08-19T03:35:57.474Z	INFO	[beat]	instance/beat.go:1039	Beat info	{"system_info": {"beat": {"path": {"config": "/usr/share/filebeat", "data": "/usr/share/filebeat/data", "home": "/usr/share/filebeat", "logs": "/usr/share/filebeat/logs"}, "type": "filebeat", "uuid": "360f943c-f2f4-41a4-a410-00db1745adcf"}}}
2022-08-19T03:35:57.474Z	INFO	[beat]	instance/beat.go:1048	Build info	{"system_info": {"build": {"commit": "1993ee88a11cb34f61a1fb45c7c3cf50533682cb", "libbeat": "7.17.3", "time": "2022-04-19T09:27:20.000Z", "version": "7.17.3"}}}
2022-08-19T03:35:57.474Z	INFO	[beat]	instance/beat.go:1051	Go runtime info	{"system_info": {"go": {"os":"linux","arch":"amd64","max_procs":16,"version":"go1.17.8"}}}
2022-08-19T03:35:57.474Z	INFO	[beat]	instance/beat.go:1055	Host info	{"system_info": {"host": {"architecture":"x86_64","boot_time":"2022-08-07T22:40:31Z","containerized":true,"name":"filebeat-filebeat-zbbwf","ip":["127.0.0.1/8","10.0.82.134/32"],"kernel_version":"5.4.204-113.362.amzn2.x86_64","mac":["d6:14:ef:44:40:5e"],"os":{"type":"linux","family":"debian","platform":"ubuntu","name":"Ubuntu","version":"20.04.4 LTS (Focal Fossa)","major":20,"minor":4,"patch":4,"codename":"focal"},"timezone":"UTC","timezone_offset_sec":0}}}
2022-08-19T03:35:57.559Z	INFO	[beat]	instance/beat.go:1084	Process info	{"system_info": {"process": {"capabilities": {"inheritable":null,"permitted":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"effective":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"bounding":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"ambient":null}, "cwd": "/usr/share/filebeat", "exe": "/usr/share/filebeat/filebeat", "name": "filebeat", "pid": 7, "ppid": 1, "seccomp": {"mode":"filter","no_new_privs":true}, "start_time": "2022-08-19T03:35:56.290Z"}}}
2022-08-19T03:35:57.559Z	INFO	instance/beat.go:328	Setup Beat: filebeat; Version: 7.17.3
2022-08-19T03:35:57.560Z	INFO	[esclientleg]	eslegclient/connection.go:105	elasticsearch url: https://9884b8f3246148afbfcbe768ab9374cc.us-west-2.aws.found.io:443
2022-08-19T03:35:57.560Z	INFO	[publisher]	pipeline/module.go:113	Beat name: filebeat-filebeat-zbbwf
2022-08-19T03:35:57.561Z	INFO	[monitoring]	log/log.go:142	Starting metrics logging every 30s
2022-08-19T03:35:57.561Z	INFO	instance/beat.go:492	filebeat start running.
2022-08-19T03:35:57.561Z	INFO	memlog/store.go:119	Loading data file of '/usr/share/filebeat/data/registry/filebeat' succeeded. Active transaction id=1499934
2022-08-19T03:35:58.292Z	INFO	memlog/store.go:124	Finished loading transaction log file for '/usr/share/filebeat/data/registry/filebeat'. Active transaction id=1522864
2022-08-19T03:35:58.292Z	INFO	[registrar]	registrar/registrar.go:109	States Loaded from registrar: 44
2022-08-19T03:35:58.292Z	INFO	[crawler]	beater/crawler.go:71	Loading Inputs: 1
2022-08-19T03:35:58.292Z	INFO	[crawler]	beater/crawler.go:117	starting input, keys present on the config: [filebeat.inputs.0.exclude_files.0 filebeat.inputs.0.exclude_files.1 filebeat.inputs.0.exclude_files.2 filebeat.inputs.0.exclude_files.3 filebeat.inputs.0.id filebeat.inputs.0.parsers.0.container.stream filebeat.inputs.0.parsers.1.ndjson.message_key filebeat.inputs.0.parsers.1.ndjson.overwrite_keys filebeat.inputs.0.parsers.1.ndjson.target filebeat.inputs.0.paths.0 filebeat.inputs.0.processors.0.add_fields.fields.name filebeat.inputs.0.processors.0.add_fields.target filebeat.inputs.0.processors.1.add_kubernetes_metadata.host filebeat.inputs.0.processors.1.add_kubernetes_metadata.matchers.0.logs_path.logs_path filebeat.inputs.0.type]
2022-08-19T03:35:58.293Z	INFO	[crawler]	beater/crawler.go:148	Starting input (ID: 17875414307430401863)
2022-08-19T03:35:58.293Z	INFO	[crawler]	beater/crawler.go:106	Loading and starting Inputs completed. Enabled inputs: 1
2022-08-19T03:35:58.293Z	INFO	[input.filestream]	compat/compat.go:111	Input filestream starting	{"id": "filebeat-container-all-non-logger-application-logs"}
2022-08-19T03:35:58.293Z	INFO	[file_watcher]	filestream/fswatch.go:138	Start next scan
2022-08-19T03:35:58.359Z	INFO	add_kubernetes_metadata/kubernetes.go:72	add_kubernetes_metadata: kubernetes env detected, with version: v1.21.13-eks-84b4fe6
2022-08-19T03:35:58.359Z	INFO	[kubernetes]	kubernetes/util.go:122	kubernetes: Using node ip-10-0-94-61.us-west-2.compute.internal provided in the config	{"libbeat.processor": "add_kubernetes_metadata"}
2022-08-19T03:36:08.293Z	INFO	[file_watcher]	filestream/fswatch.go:138	Start next scan
2022-08-19T03:36:18.294Z	INFO	[file_watcher]	filestream/fswatch.go:138	Start next scan
2022-08-19T03:36:27.564Z	INFO	[monitoring]	log/log.go:184	Non-zero metrics in the last 30s	{"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"cfs":{"period":{"us":100000},"quota":{"us":40000}},"id":"/","stats":{"periods":47,"throttled":{"ns":4590193017,"periods":21}}},"cpuacct":{"id":"/","total":{"ns":968354861}},"memory":{"id":"/","mem":{"limit":{"bytes":209715200},"usage":{"bytes":59146240}}}},"cpu":{"system":{"ticks":60,"time":{"ms":60}},"total":{"ticks":540,"time":{"ms":542},"value":540},"user":{"ticks":480,"time":{"ms":482}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":11},"info":{"ephemeral_id":"5196eb54-4978-422f-bc3f-a7b6db400ddd","uptime":{"ms":30295},"version":"7.17.3"},"memstats":{"gc_next":22916048,"memory_alloc":14550528,"memory_sys":40911880,"memory_total":119867128,"rss":128077824},"runtime":{"goroutines":72}},"filebeat":{"harvester":{"open_files":0,"running":0}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"active":0},"type":"elasticsearch"},"pipeline":{"clients":0,"events":{"active":0},"queue":{"max_events":4096}}},"registrar":{"states":{"current":0}},"system":{"cpu":{"cores":16},"load":{"1":0.48,"15":0.49,"5":0.66,"norm":{"1":0.03,"15":0.0306,"5":0.0413}}}}}}
2022-08-19T03:36:28.293Z	INFO	[file_watcher]	filestream/fswatch.go:138	Start next scan
2022-08-19T03:36:38.294Z	INFO	[file_watcher]	filestream/fswatch.go:138	Start next scan
2022-08-19T03:36:48.293Z	INFO	[file_watcher]	filestream/fswatch.go:138	Start next scan
2022-08-19T03:36:57.564Z	INFO	[monitoring]	log/log.go:184	Non-zero metrics in the last 30s	{"monitoring": {"metrics": {"beat":{"cgroup":{"cpu":{"stats":{"periods":34,"throttled":{"ns":3187500794,"periods":12}}},"cpuacct":{"total":{"ns":633471481}},"memory":{"mem":{"usage":{"bytes":2162688}}}},"cpu":{"system":{"ticks":60,"time":{"ms":7}},"total":{"ticks":540,"time":{"ms":8},"value":540},"user":{"ticks":480,"time":{"ms":1}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":11},"info":{"ephemeral_id":"5196eb54-4978-422f-bc3f-a7b6db400ddd","uptime":{"ms":60294},"version":"7.17.3"},"memstats":{"gc_next":22916048,"memory_alloc":15176456,"memory_total":120493056,"rss":128077824},"runtime":{"goroutines":72}},"filebeat":{"harvester":{"open_files":0,"running":0}},"libbeat":{"config":{"module":{"running":0}},"output":{"events":{"active":0}},"pipeline":{"clients":0,"events":{"active":0}}},"registrar":{"states":{"current":0}},"system":{"load":{"1":0.51,"15":0.48,"5":0.65,"norm":{"1":0.0319,"15":0.03,"5":0.0406}}}}}}
2022-08-19T03:36:58.294Z	INFO	[file_watcher]	filestream/fswatch.go:138	Start next scan

I meant the container logs, not the filebeat log... but I will look at those too. At least you were able to get to them and see them fine.

OK are you running kubernetes? Where are you running filebeat and your containers? ... when you say you are running it in a container?

If you are running in k8s, why are you not following the guide for running Filebeat on Kubernetes?

I am a bit lost... is there some piece of the puzzle missing?

Thanks for the link - I will try a flavor of that

We are running on kubernetes. What we did is:

> helm repo add elastic https://helm.elastic.co
> helm show values elastic/filebeat > values.yaml

then created for our specific cluster

daemonset:
  enabled: true
  filebeatConfig:
    filebeat.yml: |
      ...

merging like

helm template elastic/filebeat --version 7.17.3 --name-template filebeat --namespace filebeat -f upstream/values.yaml -f upstream/stytch-dev-xvouj9zs-values.yaml > manifests/stytch-dev-xvouj9zs/manifests.yaml

and running this manifest file as a daemonset using Argo - I am not manually running kubernetes commands, so some of the k8s specific things are over my head

I will try the "Parsing json logs" section of that guide, but I don't have nicely named containers like

          - condition:
              contains:
                kubernetes.container.name: "json-logging"

the configuration of the rest of the applications is outside of my control. My goal is to write both json logs and non-json logs from different applications to different indices on the same elastic cluster, and to gracefully handle when a logfile contains both json and non-json data

One clarifying question: is filebeat.autodiscover an alternative to filebeat.inputs, or do you use both?

I am also not seeing any output with

daemonset:
  enabled: true
  filebeatConfig:
    filebeat.yml: |
      filebeat.autodiscover:
        providers:
            - type: kubernetes
              node: ${NODE_NAME}
              templates:
                - condition:
                    contains:
                      kubernetes.container.name: "app"
                  config:
                    - type: container
                      paths:
                        - "/var/log/containers/*.log"
                      exclude_paths:
                        - "/var/log/containers/filebeat*"
                        - "/var/log/containers/log*"
                        - "/var/log/containers/fluent*"
                      json.keys_under_root: true
                      json.add_error_key: true
                      json.message_key: message

      output.elasticsearch:
        ...

Unfortunately had to investigate further

Additional question: is there a reference on the condition logic? I only see references to equals, contains, and not.contains

I am not a Helm operator; I am a kubectl operator (barely :wink:), and with only snippets it is hard for me to tell. Perhaps someone else can do better.

You don't need that if everything is writing JSON; that is just showing how you could apply a condition if you had both JSON and non-JSON logs.

Yes, one or the other. You typically use autodiscover when you want to use metadata to conditionally apply modules or logic etc. (actually you can just use autodiscover generically too... it is pretty cool, but easy to mess up with slight config errors)

If you have all your containers logging in JSON, I would just start with the input

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: kube-system
  labels:
    k8s-app: filebeat
data:
  filebeat.yml: |-
    filebeat.inputs:
    - type: container
      paths:
        - /var/log/containers/*.log
      json.keys_under_root: true
      json.add_error_key: true
      json.message_key: message
      processors:
        - add_kubernetes_metadata:
            host: ${NODE_NAME}
            matchers:
            - logs_path:
                logs_path: "/var/log/containers/"

....

I think (I don't have anything to test at this moment) the equivalent autodiscover would look like this... since you have no condition

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: kube-system
  labels:
    k8s-app: filebeat
data:
  filebeat.yml: |-
    filebeat.autodiscover:
      providers:
          - type: kubernetes
            node: ${NODE_NAME}
            templates:
              - config:
                  - type: container
                    paths:
                      - "/var/log/containers/*-${data.kubernetes.container.id}.log"
                    json.keys_under_root: true
                    json.add_error_key: true
                    json.message_key: message
...

Hi Stephen,

We've tried a few permutations of the filestream input with no success. Some issues:

  • the files we had been reading with container input were symlinks, and I don't think that works with filestream
  • the files are on a mounted volume, and the docs seem to imply this is not supported for filestream, with various warnings about changes in inode and device IDs

I'm left with the conclusion that:

  1. We should not use filestream input for our kubernetes use case because filebeat will not be able to read and manage the log files correctly
  2. container input does not work well with our json log lines, it does not support the same parsers concept
  3. autodiscover is too magical and is even further removed than container w.r.t. managing individual log lines

So I'd like to just back up and ask the overall question: what is the best way to load heterogeneous log lines into Elasticsearch?

We have logs in formats like:

  1. raw json
  2. raw text
  3. stringified json
{"log":"{\"action\":\"GetSuccessfulEvents\",\"time\":\"2022-08-22T22:06:47.303Z\"}\n","stream":"stdout","time":"2022-08-22T22:06:47.304190992Z"}
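To make the three categories concrete, here is a minimal Python sketch that sorts a single log line into one of them. It is only an illustration, not anything Filebeat runs: the function name `classify_line` is hypothetical, and it assumes that "stringified json" means a wrapper object whose `log` field holds escaped JSON, as in the example line above.

```python
import json


def classify_line(line):
    """Return 'raw json', 'stringified json', or 'raw text' for one log line.

    Assumption: 'stringified json' is a JSON object whose "log" field is a
    string that itself parses as a JSON object.
    """
    try:
        outer = json.loads(line)
    except ValueError:
        # Not parseable as JSON at all.
        return "raw text"
    if not isinstance(outer, dict):
        # A bare number or string is treated as plain text here.
        return "raw text"
    inner = outer.get("log")
    if isinstance(inner, str):
        try:
            if isinstance(json.loads(inner), dict):
                return "stringified json"
        except ValueError:
            pass
    return "raw json"
```

Under this definition a wrapper whose `log` field is plain text, such as `{"log":"INFO6"}`, counts as "raw json", which is one reason the distinction between formats 1 and 3 needs pinning down.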

Apologies you are having issues...

I will need to look at the filestream symlink thing.

Ok back to basics...

If you can use the container input and you are getting the data, then we can use ingest pipelines to work with it (why there is no JSON parser for it is beyond me... I will need to ask)

  1. raw json
  2. raw text
  3. stringified json

Questions:

First, what is the difference between 1 and 3? Does 1 mean pretty-printed / multiline JSON? Most log files are not pretty-printed JSON...

Can you show me an example of each?

Do you have any indicators / fields / tags to determine the type? If not, OK, but it will be cleaner with something to filter on.

Show me a sample of each and I will show you an ingest pipeline that will work and how to set it in filebeat.

It looks like that, but for our customers that use lots of K8s annotations etc. it can be magical, because it actually does the exact opposite: it allows fine-grained control, but it is not easy :slight_smile:

Thanks - I think I see where the magic comes in as I am also running into field mapping errors on the ES side when I do send data in container mode :thinking:

On our side I think part of our wounds are self-inflicted from writing different kinds of logs to the same output stream, where we'll have both

{"log":"2022-08-17 15:10:56 UTC | CORE | INFO | (pkg/util/log/log.go:610 in func1) | runtime: final GOMAXPROCS value is: 2\n","stream":"stdout","time":"2022-08-17T15:10:56.147098957Z"}

and

{"log":"{\"level\":\"info\",\"ts\":\"2022-08-17T15:10:24.666Z\",\"caller\":\"entrypoint.sh\",\"msg\":\"Install CNI binary..\"}\n","stream":"stdout","time":"2022-08-17T15:10:24.667421486Z"}

in the same output file. I am not sure if/how we can fix it, albeit in my tests outside of k8s (from my laptop), I'm able to write to a file like

{"log":"{\"action\":\"GetSuccessfulEvents6\"}\n","stream":"stdout"}
{"log":"INFO6"}
test6

and output

{
  "@timestamp": "2022-08-23T01:41:58.348Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "8.3.3"
  },
  "input": {
    "type": "filestream"
  },
  "host": {
    "name": "Austins-MacBook-Pro-2.local"
  },
  "agent": {
    "version": "8.3.3",
    "ephemeral_id": "3a009b5a-5382-4e38-a266-3bcb7b7ee716",
    "id": "bcaa4287-61ea-479c-bfc8-e2d67f3e34de",
    "name": "Austins-MacBook-Pro-2.local",
    "type": "filebeat"
  },
  "ecs": {
    "version": "8.0.0"
  },
  "log": {
    "offset": 363,
    "file": {
      "path": "/var/logs/pods/account-manager_account-manager-deployment-858479f95b-jh6v8_9f7d2bb5-85ef-48f7-b64f-c25175cc25de/account-manager/1.log"
    }
  },
  "data": {
    "action": "GetSuccessfulEvents6"
  },
  "message": "{\"action\":\"GetSuccessfulEvents6\"}\n"
}
{
  "@timestamp": "2022-08-23T01:41:58.348Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "8.3.3"
  },
  "agent": {
    "id": "bcaa4287-61ea-479c-bfc8-e2d67f3e34de",
    "name": "Austins-MacBook-Pro-2.local",
    "type": "filebeat",
    "version": "8.3.3",
    "ephemeral_id": "3a009b5a-5382-4e38-a266-3bcb7b7ee716"
  },
  "ecs": {
    "version": "8.0.0"
  },
  "log": {
    "file": {
      "path": "/var/logs/pods/account-manager_account-manager-deployment-858479f95b-jh6v8_9f7d2bb5-85ef-48f7-b64f-c25175cc25de/account-manager/1.log"
    },
    "offset": 431
  },
  "data": {
    "log": "INFO6"
  },
  "message": "INFO6",
  "input": {
    "type": "filestream"
  },
  "error": {
    "data": "INFO6",
    "field": "data.log",
    "message": "parsing input as JSON: invalid character 'I' looking for beginning of value"
  },
  "host": {
    "name": "Austins-MacBook-Pro-2.local"
  }
}
{
  "@timestamp": "2022-08-23T01:41:58.348Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "8.3.3"
  },
  "log": {
    "offset": 447,
    "file": {
      "path": "/var/logs/pods/account-manager_account-manager-deployment-858479f95b-jh6v8_9f7d2bb5-85ef-48f7-b64f-c25175cc25de/account-manager/1.log"
    }
  },
  "message": "test6",
  "input": {
    "type": "filestream"
  },
  "host": {
    "name": "Austins-MacBook-Pro-2.local"
  },
  "agent": {
    "name": "Austins-MacBook-Pro-2.local",
    "type": "filebeat",
    "version": "8.3.3",
    "ephemeral_id": "3a009b5a-5382-4e38-a266-3bcb7b7ee716",
    "id": "bcaa4287-61ea-479c-bfc8-e2d67f3e34de"
  },
  "ecs": {
    "version": "8.0.0"
  }
}

and while not perfect, it does load reasonably into ES as well.

I'm not sure how to get a nice message replacement out of decode_json_fields, but would visit that if it worked on k8s

trying to recreate on kubernetes, if I do

      filebeat.inputs:
      - type: filestream
        id: filestream-json
        paths:
          - /var/log/containers/*
          - /var/log/pods/*
          - /var/log/pods/*/*
          - /var/log/pods/*/*/*
          - /var/log/pods/*/*/*/*

        parsers:
          - ndjson:
              target: "data"
              overwrite_keys: true
              message_key: message

        processors:
          - decode_json_fields:
              fields: ["data.log"]
              target: "data"
              add_error_key: true
              overwrite_keys: true

it doesn't find any of the files on the mounted volume, for example, if I bash into the filebeat pod:

kubectl exec -it filebeat-filebeat-258fv -- /bin/sh

I can list the symlink:

# ls -latrh /var/log/containers/*
lrwxrwxrwx 1 root root  96 Aug 17 15:10 /var/log/containers/kube-proxy-fxsv7_kube-system_kube-proxy-09c3cc4923bffa99cd207a76f2b1234b5caeac63a6861835acc2a046bb7b62.log -> /var/log/pods/kube-system_kube-proxy-fxsv7_eea5374d-e982-1234-8255-dccc40752968/kube-proxy/0.log

and the referenced file:

# ls /var/log/pods/*/*/*

/var/log/pods/kube-system_aws-node-s9cd4_4b62bf53-8d35-4a0f-123a-3c4ed61c4567/aws-node/0.log

but can't make them visible to filestream input - it just doesn't seem to find any permutation of paths input here. After reading through more of the documentation, my concern is that filestream isn't a good solution for a mounted volume
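One way to check what a file input would actually be looking at is to list each match of a glob and see whether it is a symlink and where it resolves. The sketch below is a plain Python diagnostic using only the standard library; the function name `describe_log_paths` is hypothetical, and it says nothing about how filestream itself scans paths. The filestream documentation does describe a `prospector.scanner.symlinks` option for harvesting symlinks, but whether that setting is enough on 7.17.3 in this cluster is not something this thread confirms.

```python
import glob
import os


def describe_log_paths(pattern):
    """List (path, is_symlink, resolved_target) for every match of a glob."""
    rows = []
    for path in sorted(glob.glob(pattern)):
        # islink() inspects the entry itself; realpath() follows the link chain.
        rows.append((path, os.path.islink(path), os.path.realpath(path)))
    return rows
```

Calling `describe_log_paths("/var/log/containers/*.log")` inside the pod would show each entry, whether it is a symlink, and the file under /var/log/pods it resolves to, which helps separate a symlink problem from a path or volume-mount problem.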

Separately, trying container I do see the data from

filebeat.inputs:
- type: container
  id: austin-local-test
  paths:
    - /var/log/containers/*

output.elasticsearch:
  hosts: ["https://<redacted>.us-west-2.aws.found.io:443"]
  username: "elastic"
  password: "<redacted>"
  index: "logs-austin-testlocal-%{+yyyy.MM.dd}"

setup.ilm.enabled: false
setup.template.name: "30-days-default"
setup.template.pattern: "logs-*-*-*"

albeit currently I am just getting mapping parser errors like

2022-08-22T23:27:12.599Z	WARN	[elasticsearch]	elasticsearch/client.go:414	Cannot index event publisher.Event{Content:beat.Event{Timestamp:time.Date(2022, time.August, 18, 23, 55, 56, 199653707, time.UTC), Meta:null, Fields:{"agent":{"ephemeral_id":"2a6933b5-444e-4c06-a143-9bf65b400a3a","hostname":"filebeat-filebeat-258fv","id":"2584e428-f841-498d-8683-090ace3eb62d","name":"filebeat-filebeat-258fv","type":"filebeat","version":"7.17.3"},"ecs":{"version":"1.12.0"},"host":{"name":"filebeat-filebeat-258fv"},"input":{"type":"container"},"log":{"file":{"path":"/var/log/containers/datadog-njl9m_datadog_agent-9ed6ea3f2f12318cea9fa704e1bed9498d8d2eaf495a0b5a595b12a4ef8f820f.log"},"offset":376573},"message":"2022-08-18 23:55:56 UTC | CORE | INFO | (pkg/serializer/serializer.go:371 in sendMetadata) | Sent metadata payload, size (raw/compressed): 1513/382 bytes.","stream":"stdout"}, Private:file.State{Id:"native::41002687-66305", PrevId:"", Finished:false, Fileinfo:(*os.fileStat)(0xc0000f7520), Source:"/var/log/containers/datadog-njl9m_datadog_agent-9ed6ea3f2f12318cea9fa704e1bed9498d8d2eaf495a0b5a595b12a4ef8f820f.log", Offset:376798, Timestamp:time.Date(2022, time.August, 22, 23, 27, 9, 555202677, time.Local), TTL:-1, Type:"container", Meta:map[string]string(nil), FileStateOS:file.StateOS{Inode:0x271a6bf, Device:0x10301}, IdentifierName:"native"}, TimeSeries:false}, Flags:0x1, Cache:publisher.EventCache{m:common.MapStr(nil)}} (status=400): {"type":"mapper_parsing_exception","reason":"Cannot write to a field alias [agent.hostname]."}, dropping event!

which here, I believe, is because my previous attempts at writing to the same template pattern created this field alias agent.hostname, which I'm now setting explicitly? I'm hesitant to continue down this path for container, since I've been unable to get it to decode my JSON data anyway; it doesn't seem to have a parsers.ndjson equivalent

I don't have any autodiscover errors handy, but we ran into similar issues as well

Which path would you recommend we take between the three options, and do you have a recommended configuration that you would use to handle the different data formats? I would like to avoid having to hardcode kubernetes namespaces / container names as much as I can so that the system would continue to work as they changed, in addition to the fact that we don't have homogeneous logging as it is. If we truly need to have each file only have one type of formatted log we could try to head down that path, but that presents a different form of risk for us

Alternatively, if the answer is that we have to do a lot of manual configuration/setup on the Elasticsearch side w.r.t. configuring field mappings, types, aliases, etc., that would be good to know as well - we're hoping to avoid having to completely define all the logging schema for this use case but have already become quite wary of the mapping parser exceptions we've encountered

And thank you for all your advice and help!

@Austin_ES_Questions Thanks for the Info... I will digest more of it..

Apologies ... for getting side tracked with the filestream... that is on me.

Please take this as observation / perhaps a suggestion only.

Retro/Perspective:
Usually, if we had talked earlier, I would tell folks just getting started with the Elastic Stack / Filebeat etc. to start mostly or completely with the OOTB configuration (there are a LOT of non-obvious reasons for the defaults), see what it does, get to know the stack, and then start to make some adjustments.

I often see exactly what you are doing... which is fine... but it tends to be a longer route to what you want.
You want to define your own index names, mappings, ILM etc. OK, all good, and really it should be simpler, BUT there are a lot of interdependencies... and then there is ECS, which when used properly (and I agree that is not always easy) can be a big value.
You appear to have even created field aliases... Nice / Cool.
It seems like it should be straightforward (and yes, I think it should be easier!) but it is not always... and then we end up here :slight_smile:

It's a little hard for me to help fix because I don't know what you have defined in the mappings, aliases etc. and what is lurking to bite you...

My normal approach is to send the logs to Elasticsearch...
Using all the defaults, including the filebeat-* index naming...
Look at them... see what is not parsed and what I want to parse.
Add an ingest pipeline... parse the most common format.
Then parse the next most common.
Repeat until pretty much everything is parsed.
Since you have a lot of JSON, a lot of the parsing can be easier; other lines may not need to be parsed at all.
Then, if I want different index naming, now that I know what I have to deal with, I tackle that next...
Depending on the complexity of the data, that can all be done fairly quickly.

Path Forward:
If you are willing, I can help you get there... but we will go completely basic to start and then go from there.

filebeat.inputs:
- type: container

take out all of this

#  index: "logs-austin-testlocal-%{+yyyy.MM.dd}"

#setup.ilm.enabled: false
#setup.template.name: "30-days-default"
#setup.template.pattern: "logs-*-*-*"

Clean up and let's get started...

I am really still trying to simply get a sample of your logs....
I can't tell if it is 3 types, 5 types or hundreds of types....

I think I see 5 types... is this correct? Are these the 5 types of log lines inside the log files?
(Not what you already see in Elasticsearch)

{"log":"2022-08-17 15:10:56 UTC | CORE | INFO | (pkg/util/log/log.go:610 in func1) | runtime: final GOMAXPROCS value is: 2\n","stream":"stdout","time":"2022-08-17T15:10:56.147098957Z"}
{"log":"{\"level\":\"info\",\"ts\":\"2022-08-17T15:10:24.666Z\",\"caller\":\"entrypoint.sh\",\"msg\":\"Install CNI binary..\"}\n","stream":"stdout","time":"2022-08-17T15:10:24.667421486Z"}
{"log":"{\"action\":\"GetSuccessfulEvents6\"}\n","stream":"stdout"}
{"log":"INFO6"}
test6

I will start on the pipeline based on these...

Those embedded newlines are no fun (...ts6\"}\n","stream...), but I think I've got that covered.
What is really going to hurt is that you have embedded JSON with a log field that is sometimes a straight value and other times an object. That makes it hard to process... I will work on that... BUT that is one of the first things to fix for consistency: it is either an object or a string... you will see how I fix that.

Let me know if this is correct... OR please provide the Set of log lines...

BTW, self-inflicted inconsistent logs... I see that everywhere; it just is. A drive to uniformity is great, but this is a pretty common situation. Over time, if you can edge in that direction, that is great, but it is usually a long-term project. I say that as someone who ran a large dev and ops shop for years... it was tough!

So that was a bit tough... the mix of some concrete values and some objects takes a little thinking... I had forgotten the fix pattern.

But here you go... this simple pipeline basically makes a first pass at parsing your logs. It contains a little magic at the end, moving some fields around to be more consistent and fixing the concrete / object mapping issues.

This is the pipeline.
It gets rid of the pesky newlines.
Then it expands all the JSON, and just drops through if the line is not JSON.
THEN it does a check and moves the simple log fields under the log_details object.

PUT _ingest/pipeline/discuss-mixed-container
{
  "processors": [
    {
      "gsub": {
        "field": "message",
        "pattern": """\\n""",
        "replacement": ""
      }
    },
    {
      "json": {
        "field": "message",
        "target_field": "message_details",
        "ignore_failure": true
      }
    },
    {
      "json": {
        "field": "message_details.log",
        "target_field": "message_details.log_details",
        "ignore_failure": true
      }
    },
    {
      "rename": {
        "if": "ctx?.message_details != null && ctx?.message_details?.log != null && ctx?.message_details?.log_details == null",
        "field": "message_details.log",
        "target_field": "message_details.log_details.msg"
      }
    }
  ]
}

Then I simulate it with the sample you gave me

POST _ingest/pipeline/discuss-mixed-container/_simulate
{
  "docs": [
    {
      "_source": {
        "message": """{"log":"2022-08-17 15:10:56 UTC | CORE | INFO | (pkg/util/log/log.go:610 in func1) | runtime: final GOMAXPROCS value is: 2\n","stream":"stdout","time":"2022-08-17T15:10:56.147098957Z"}"""
      }
    },
    {
      "_source": {
        "message": """{"log":"{\"level\":\"info\",\"ts\":\"2022-08-17T15:10:24.666Z\",\"caller\":\"entrypoint.sh\",\"msg\":\"Install CNI binary..\"}\n","stream":"stdout","time":"2022-08-17T15:10:24.667421486Z"}"""
      }
    },
    {
      "_source": {
        "message": """{"log":"{\"action\":\"GetSuccessfulEvents6\"}\n","stream":"stdout"}"""
      }
    },
    {
      "_source": {
        "message": """{"log":"INFO6"}"""
      }
    },
    {
      "_source": {
        "message": "test6"
      }
    }
  ]
}

And the result is nicely parsed logs:

{
  "docs" : [
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_doc",
        "_id" : "_id",
        "_source" : {
          "message_details" : {
            "stream" : "stdout",
            "log_details" : {
              "msg" : "2022-08-17 15:10:56 UTC | CORE | INFO | (pkg/util/log/log.go:610 in func1) | runtime: final GOMAXPROCS value is: 2"
            },
            "time" : "2022-08-17T15:10:56.147098957Z"
          },
          "message" : """{"log":"2022-08-17 15:10:56 UTC | CORE | INFO | (pkg/util/log/log.go:610 in func1) | runtime: final GOMAXPROCS value is: 2","stream":"stdout","time":"2022-08-17T15:10:56.147098957Z"}"""
        },
        "_ingest" : {
          "timestamp" : "2022-08-23T05:48:28.356906446Z"
        }
      }
    },
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_doc",
        "_id" : "_id",
        "_source" : {
          "message_details" : {
            "log_details" : {
              "msg" : "Install CNI binary..",
              "caller" : "entrypoint.sh",
              "level" : "info",
              "ts" : "2022-08-17T15:10:24.666Z"
            },
            "time" : "2022-08-17T15:10:24.667421486Z",
            "log" : """{"level":"info","ts":"2022-08-17T15:10:24.666Z","caller":"entrypoint.sh","msg":"Install CNI binary.."}""",
            "stream" : "stdout"
          },
          "message" : """{"log":"{\"level\":\"info\",\"ts\":\"2022-08-17T15:10:24.666Z\",\"caller\":\"entrypoint.sh\",\"msg\":\"Install CNI binary..\"}","stream":"stdout","time":"2022-08-17T15:10:24.667421486Z"}"""
        },
        "_ingest" : {
          "timestamp" : "2022-08-23T05:48:28.356910139Z"
        }
      }
    },
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_doc",
        "_id" : "_id",
        "_source" : {
          "message_details" : {
            "log" : """{"action":"GetSuccessfulEvents6"}""",
            "stream" : "stdout",
            "log_details" : {
              "action" : "GetSuccessfulEvents6"
            }
          },
          "message" : """{"log":"{\"action\":\"GetSuccessfulEvents6\"}","stream":"stdout"}"""
        },
        "_ingest" : {
          "timestamp" : "2022-08-23T05:48:28.356911767Z"
        }
      }
    },
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_doc",
        "_id" : "_id",
        "_source" : {
          "message_details" : {
            "log_details" : {
              "msg" : "INFO6"
            }
          },
          "message" : """{"log":"INFO6"}"""
        },
        "_ingest" : {
          "timestamp" : "2022-08-23T05:48:28.356913219Z"
        }
      }
    },
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_doc",
        "_id" : "_id",
        "_source" : {
          "message" : "test6"
        },
        "_ingest" : {
          "timestamp" : "2022-08-23T05:48:28.356914657Z"
        }
      }
    }
  ]
}

And to get this, all you have to do is add the pipeline to your output. Note that I did not set the index name... I just let it use the default. You get some extra goodies that way.

output.elasticsearch:
  hosts: ["https://<redacted>.us-west-2.aws.found.io:443"]
  username: "elastic"
  password: "<redacted>"
  pipeline: "discuss-mixed-container" 

And here is what it looks like in Discover, all nice and parsed. The fields are keyword because of the automatic Filebeat dynamic template, another nicety of the defaults (although you would probably want to set the ts fields etc.). With a short pipeline and the defaults, you get solid logs, parsing, and search.
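
On the timestamp remark, one possible follow-up (a sketch I have not run against your cluster, not part of the pipeline above) is a date processor that copies the container time into @timestamp. The inline pipeline below is only an assumption about how you might want to handle it. It uses the _simulate API with a pipeline definition in the request body, so nothing is stored. Paste it into Kibana Dev Tools:

```json
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "date": {
          "field": "message_details.time",
          "formats": ["ISO8601"],
          "target_field": "@timestamp",
          "ignore_failure": true
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message_details": {
          "stream": "stdout",
          "time": "2022-08-17T15:10:56.147098957Z"
        }
      }
    },
    {
      "_source": {
        "message": "test6"
      }
    }
  ]
}
```

The second doc has no time field, so the processor fails and ignore_failure lets the document pass through unchanged, the same trick the json processors use. If the result looks right, the processor could be appended to the processors list of discuss-mixed-container.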

I know it is a lot... digest it and let us know what you think.

I really appreciate the thorough reply - I'll take a look at this configuration and report back, thanks!

I think I am missing something here - it seems to me that a lot of tools (e.g., filebeat + logstash), prefer to write with filebeat-* and logstash-* index names

But from my user's perspective these aren't useful monikers - we want to form index patterns based on useful things to search (and to exclude) concurrently, so we aspire to have patterns like

log-app-frontend-sdk
log-app-frontend-dashboard
log-app-backend-api
log-monitor-rates
log-monitor-heartbeat

so that we can use index patterns like

log-*
log-app-*
log-monitor-*
log-app-frontend-*
log-app-backend-*

etc.

(1) Is this not a common pattern? Are we missing out on some default "goodies" by doing this?
(2) Is it a mistake to configure this in filebeat? Should we be configuring it in ES (e.g., as part of the "pipeline" we're defining, like discuss-mixed-container)?

Yes. Yes. Correct, those are the very well thought through, designed defaults, consistencies, OOTB functionality, etc. that I referred to. By not using the defaults, you have signed up to do it all manually yourself.

Of course they do... but if you take a look at my suggested steps above for someone new to the stack, renaming indices is one of the last things I do, not the first. Did you get everything working with filebeat-*? If so, then we can move on to renaming the indices, which will take a number of additional steps to do right and consistently.

BTW there is a completely different view of this which many people also use: one where everything goes into filebeat-* and is simply tagged on the way in, and then you simply filter on the tags you applied (app, monitor, app-frontend, app-backend). Exact same result, and possibly more efficient. I get the need / perception to have indices named differently, but it's just a filter. There are some legitimate reasons to do that for scale, and "your users just want it that way" is also a legitimate reason. It just requires more work and a much better understanding of data management in the Elastic Stack.
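
To make the tag idea concrete, here is a minimal sketch of what the "it's just a filter" side looks like. It assumes the events were tagged in Filebeat (for example with the tags option on the input) so that the standard tags field carries values like app-frontend. The tag values are the ones from your list, nothing official:

```json
GET filebeat-*/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "tags": "app-frontend" } }
      ]
    }
  }
}
```

In Discover the equivalent would be a KQL filter along the lines of tags : "app-frontend", which you can save and share the same way you would share an index pattern.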

So if you get it all working in filebeat-*, then you would:

  • Copy the Filebeat template for each log type, with the appropriate index patterns... or attempt to create your own from scratch.
  • Create 1 or more ILM policies
  • Make sure the ILM policies are correct in each template
  • Make sure the write aliases are correct in each template (there will be a different write alias for each of the types above)
  • Create the initial managed index / bootstrap the write alias for each index pattern so that the ILM policies work correctly and indices roll over
  • Create the logic in Filebeat to write to the correct indices (perhaps that is by host, so then it is just per host, etc.)
  • Set the right settings in Filebeat to ignore the defaults

You will need to understand these topics...

You can do most of this through the UI, with the exception of creating the initial managed index; see here
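
For reference, bootstrapping the initial managed index is a single request per index pattern. This is a minimal sketch using one of your proposed names, log-app-backend-api, as the write alias. It assumes a matching template already exists that sets the ILM policy and uses that same name as the rollover alias:

```json
PUT log-app-backend-api-000001
{
  "aliases": {
    "log-app-backend-api": {
      "is_write_index": true
    }
  }
}
```

The numeric suffix matters because rollover increments it (000002, 000003, ...), and Filebeat would then write to the alias rather than to a concrete index name. You would repeat this once for each of the log-* types.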

Again, all this is doable... it takes me about an hour; it will take you a while to get the first one correct. I'm not trying to scare you away, but if you want to do it right, that is what it will take.

The ingest pipeline resides in Elasticsearch. It is called by configuring the pipeline setting in Filebeat: as a document is sent by Filebeat, the pipeline setting says "apply this pipeline", i.e. process the document before actually writing it to the index.
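
You can see the same mechanism without Filebeat by indexing a document by hand and naming the pipeline on the request. This is roughly what the pipeline setting makes Filebeat do on its bulk requests. The index name pipeline-test is just a throwaway name I made up for this sketch:

```json
POST pipeline-test/_doc?pipeline=discuss-mixed-container
{
  "message": "{\"log\":\"INFO6\"}"
}

GET pipeline-test/_search
```

The stored document should come back with message_details.log_details.msg set to INFO6, the same shape as the fourth doc in the simulate output above. Delete the test index afterwards.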