Filebeat "invalid CRI log format" error

Hi,

I am frequently seeing an "invalid CRI log format" ERROR with Filebeat. We are running Filebeat as a DaemonSet on our Kubernetes nodes to collect container logs. This error causes Filebeat to stop sending logs to Logstash until Filebeat is restarted. Our container logs are in JSON format, and I am using the Filebeat Helm chart from the Elastic repo. Below are my config and a sample JSON log from a container. Can someone please help me with this issue?

Filebeat version: 7.9.3

filebeatConfig:
  filebeat.yml: |
    filebeat.inputs:
    - type: container
      paths:
        - /var/lib/docker/containers/*/*.log*
      multiline.pattern: '^\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}.\d{3}\s'
      multiline.negate: true
      multiline.match: after
      multiline.max_lines: 2000
      processors:
      - add_kubernetes_metadata:
          host: ${NODE_NAME}
          matchers:
          - logs_path:
              logs_path: "/var/lib/docker/containers/"
      - drop_fields:
          fields: ["kubernetes.labels.app"]
      - drop_event:
          when:
            or:
              - regexp:
                  kubernetes.pod.name: "filebeat.*"
    output.file.enabled: false
    output.logstash:
        hosts:
        - logstash-logstash:5044
        loadbalance: true
        compression_level: 3
        worker: 8
        bulk_max_size: 4096
        pipelining: 4
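For reference, I noticed the glob `/var/lib/docker/containers/*/*.log*` also matches rotated files such as `.log.1`, which Filebeat can pick up mid-rotation. The container input documentation instead shows the kubelet symlink path; a minimal sketch of that variant (paths only, everything else kept as above):

```yaml
filebeat.inputs:
- type: container
  paths:
    # Documented example path for the container input; the symlinked
    # files under /var/log/containers carry pod/namespace in the name
    # and the glob does not match rotated files.
    - /var/log/containers/*.log
```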
 

Sample JSON log from a container

{"log":"2021-01-28 16:26:49.071 log, headers: {}\n","stream":"stdout","time":"2021-01-28T16:26:49.07392093Z"}

Errors from Filebeat

2021-02-05T00:55:18.267Z ERROR [reader_docker_json] readjson/docker_json.go:204 Parse line error: invalid CRI log format
2021-02-05T00:55:18.268Z ERROR [reader_docker_json] readjson/docker_json.go:204 Parse line error: parsing CRI timestamp: parsing time "118,ip\"" as "2006-01-02T15:04:05.999999999Z07:00": cannot parse "ip\"" as "2006"
2021-02-05T00:55:18.268Z ERROR [reader_docker_json] readjson/docker_json.go:204 Parse line error: invalid CRI log format
2021-02-05T00:55:18.268Z ERROR [reader_docker_json] readjson/docker_json.go:204 Parse line error: parsing CRI timestamp: parsing time """ as "2006-01-02T15:04:05.999999999Z07:00": cannot parse """ as "2006"
2021-02-05T01:00:28.553Z ERROR [reader_docker_json] readjson/docker_json.go:204 Parse line error: invalid CRI log format
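The second error message hints at what is happening: with the container input's format auto-detection, a line that does not start with `{` falls back to CRI parsing, where the first whitespace-separated token must be an RFC3339 timestamp. A truncated JSON line (for example, one read mid-rotation) produces exactly these messages. A simplified Python sketch of that fallback logic (function name and the exact CRI field layout are my assumptions for illustration, not Filebeat's actual code):

```python
import json
from datetime import datetime

def parse_container_line(line: str) -> dict:
    """Simplified sketch of Filebeat's container-input auto-detection.

    Lines starting with '{' are treated as Docker json-file entries;
    anything else falls back to the CRI format:
        <RFC3339Nano timestamp> <stream> <flags> <message>
    (This field layout is an assumption for illustration.)
    """
    if line.startswith("{"):
        return json.loads(line)                      # docker json-file path
    parts = line.split(" ", 3)
    if len(parts) < 4:
        raise ValueError("invalid CRI log format")   # too few fields
    try:
        # Trim sub-microsecond digits so strptime's %f (6 digits) accepts it.
        datetime.strptime(parts[0][:26], "%Y-%m-%dT%H:%M:%S.%f")
    except ValueError:
        raise ValueError("parsing CRI timestamp: %r" % parts[0])
    return {"time": parts[0], "stream": parts[1], "log": parts[3]}
```

Both errors above fall out of this logic: a fragment with too few fields yields "invalid CRI log format", while a fragment with enough fields but a non-timestamp first token (like `118,ip"`) yields the timestamp parse error.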

Could you please share the logs which lead to these errors? The sample log you posted does not seem problematic.

@kvch thank you for your response. After some more debugging, this is what I observed:
when a Logstash pipeline gets stuck, Filebeat also gets stuck sending logs to that particular Logstash pod, waiting on the ACK. Here are the logs from Filebeat and also from Logstash.

And it only recovers if I restart either the Filebeat pod or the Logstash pod.

I don't see any error message in Logstash indicating a pipeline hang or anything similar. We are running Logstash as a Kubernetes service with 8 Logstash pods; when 1 pod is unresponsive, Filebeat sticks to that pod and waits on the ACK indefinitely. Does Filebeat create sticky sessions? Why does it wait for the ACK from that particular Logstash pod instead of closing the connection and sending to another pod that is active?

Filebeat logs

DEBUG	[logstash]	logstash/async.go:172	61 events out of 61 events sent to logstash host logstash-logstash:5044. Continue sending

Logstash logs

[2021-02-05T07:05:50,780][DEBUG][org.logstash.beats.ConnectionHandler][main] 884460aa: batches pending: true
[2021-02-05T07:05:53,405][DEBUG][org.logstash.beats.ConnectionHandler][main] 4f6b0df8: reader and writer are idle, closing remote connection
[2021-02-05T07:05:53,475][DEBUG][org.logstash.beats.ConnectionHandler][main] 17aaab9e: batches pending: true
[2021-02-05T07:05:54,110][DEBUG][org.logstash.beats.ConnectionHandler][main] 8cff884d: batches pending: true
[2021-02-05T07:06:15,369][DEBUG][org.logstash.beats.ConnectionHandler][main] babac4ef: batches pending: true
[2021-02-05T07:06:21,704][DEBUG][org.logstash.beats.ConnectionHandler][main] b8f4c8f7: batches pending: true
[2021-02-05T07:06:28,794][DEBUG][org.logstash.beats.ConnectionHandler][main] a37e2da3: batches pending: true
[2021-02-05T07:06:29,834][DEBUG][org.logstash.beats.ConnectionHandler][main] 455c7c87: batches pending: true
[2021-02-05T07:06:29,834][DEBUG][org.logstash.beats.ConnectionHandler][main] 455c7c87: batches pending: true
[2021-02-05T07:06:30,794][DEBUG][org.logstash.beats.ConnectionHandler][main] a895e7d3: batches pending: true
[2021-02-05T07:06:36,704][DEBUG][org.logstash.beats.ConnectionHandler][main] b8f4c8f7: batches pending: true
[2021-02-05T07:06:38,476][DEBUG][org.logstash.beats.ConnectionHandler][main] 17aaab9e: batches pending: true
[2021-02-05T07:06:42,113][DEBUG][org.logstash.beats.ConnectionHandler][main] 8cff884d: batches pending: true
[2021-02-05T07:06:50,779][DEBUG][org.logstash.beats.ConnectionHandler][main] b5569bb1: reader and writer are idle, closing remote connection
[2021-02-05T07:06:50,779][DEBUG][org.logstash.beats.ConnectionHandler][main] 6997090c: reader and writer are idle, closing remote connection
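One thing I am considering trying, based on the `output.logstash` docs: setting a `ttl` so connections are periodically closed and re-established instead of sticking to a wedged pod behind the Service. The docs note that `ttl` is not supported by the async client, so `pipelining` would have to be disabled. A sketch of the changed output section (untested in our setup):

```yaml
output.logstash:
  hosts:
  - logstash-logstash:5044
  loadbalance: true
  # Drop and re-establish connections every 60s so an unresponsive
  # pod behind the Service VIP is eventually abandoned.
  ttl: 60s
  # Per the docs, ttl does not work with the async client,
  # so pipelining must be set to 0 for it to take effect.
  pipelining: 0
```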

@kvch any thoughts regarding this issue?

@kvch any update on this issue?