Filtered log messages show up as empty fields in Kibana

I have filtered my log messages using grok, but when I check Kibana, the new fields appear on the left side of the page and they are empty. I am also getting the _grokparsefailure tag.

Here's an example of my log message:

[2022-09-28 18:11:25,144] {processor.py:641} INFO - Processing file /opt/airflow/dags/dag_filtered.py for tasks to queue

Here's my logstash config file:

input {
  beats {
    port => 5044
    codec => "line"
  }
}
filter {
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp}]%{DATA:class} %{SPACE}%{LOGLEVEL:loglevel} -%{GREEDYDATA:logMessage}" }
    overwrite => [ "message" ]
  }
  date {
    match => [ "timestamp", "MMM dd yyyy HH:mm:ss", "MMM  d yyyy HH:mm:ss", "ISO8601" ]
    target => "@timestamp"
  }
}
output {
  elasticsearch {
    hosts => ["${IP}:9200"]
    index =>"logss-%{+YYYY.MM.dd}"
  }
}

And here's my filebeat configuration:

filebeat.inputs:
- type: filestream
  id: my-filestream-id
  enabled: true
  paths:
    - /home/ubuntu/logs/**/*.log
filebeat.config.modules:
  path: /etc/filebeat/modules.d/*.yml
  reload.enabled: false
setup.template.settings:
  index.number_of_shards: 1
output.logstash:
  hosts: ["${ip}:5044"]
processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~
  - drop_fields:
      fields: ["agent", "cloud", "ecs", "host", "input", "tags", "log.offset"]
      ignore_missing: true

When I test my log message against my grok pattern in the Grok Debugger, it works fine. So what am I missing?

Can you give an example of the [message] field on an event that has a _grokparsefailure tag?

I have shared a couple of them below:

Traceback (most recent call last)
[2022-09-23 22:51:02,857] {logging_mixin.py:115} INFO - [2022-09-23 22:51:02,857] {dag.py:2379} INFO - Sync 1 DAGs 

I am guessing that the failures happen because the lines have different patterns? I tried to fix this by adding multiple match patterns to my logstash configuration (shared below), but I still get the _grokparsefailure tag.

 match => { "message" => ["%{TIMESTAMP_ISO8601:timestamp}%{DATA:class} %{SPACE}%{LOGLEVEL:loglevel} -%{GREEDYDATA:logMessage}", "%{TIMESTAMP_ISO8601:timestamp}%{DATA:class} %{SPACE}%{LOGLEVEL:loglevel} -%{GREEDYDATA:logMessage}, execution_date=%{GREEDYDATA:execution_date}, start_date=%{GREEDYDATA:start_date}, end_date=%{GREEDYDATA:end_date}",
   "%{GREEDYDATA:logMessage}" ]}

I do not get a _grokparsefailure for either of those. You might want the first pattern to start with \[%{TIMESTAMP_ISO8601:timestamp}\].
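For example, a minimal sketch of the corrected first pattern (only the escaped brackets are new; everything else is kept from your config):

match => { "message" => "\[%{TIMESTAMP_ISO8601:timestamp}\]%{DATA:class} %{SPACE}%{LOGLEVEL:loglevel} -%{GREEDYDATA:logMessage}" }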


Add handling for parse failures so you can see which line/data caused the error.
In ELK 8+ you will have the "original" field; alternatively, do not overwrite the "message" field.
Avoid GREEDYDATA where possible; DATA is faster.
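For example, the mid-pattern GREEDYDATA captures in the multi-pattern match above could be swapped for DATA (a sketch; the final capture runs to end of line, so GREEDYDATA still makes sense there):

match => { "message" => "\[%{TIMESTAMP_ISO8601:timestamp}\]%{DATA:class} %{SPACE}%{LOGLEVEL:loglevel} -%{DATA:logMessage}, execution_date=%{DATA:execution_date}, start_date=%{DATA:start_date}, end_date=%{GREEDYDATA:end_date}" }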

output {
  if "_grokparsefailure" in [tags] {
    elasticsearch {
      hosts => ["${IP}:9200"]
      index => "logss-%{+YYYY.MM.dd}"
    }
    # or save to a file
    file { path => "/path/grokfailure_%{+YYYY-MM-dd}.txt" }
  } else {
    elasticsearch {
      hosts => ["${IP}:9200"]
      index => "logss-%{+YYYY.MM.dd}"
    }
  }
}

That's strange, because I am getting _grokparsefailure for messages without a timestamp or loglevel, and also for empty messages.

How can I handle empty messages and messages without a timestamp, like these?

message: -------------------------------------------------------------------------------- tags: _grokparsefailure
message: AIRFLOW_CTX_EXECUTION_DATE=2022-09-29T14:04:11.795487+00:00 tags: _grokparsefailure

If the message starts with ---:

if [message] =~ /^-{3}/ { ... }

If the message starts with [date]:

if [message] =~ /^\[\d{4}-\d{2}-\d{2}/ { ... }
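Putting those together, a minimal sketch of a filter block (the drop filter and the no_grok tag name are just examples, not from the thread):

filter {
  if [message] =~ /^\s*$/ {
    # skip empty lines entirely
    drop { }
  } else if [message] =~ /^\[\d{4}-\d{2}-\d{2}/ {
    # only lines that start with [date] go through grok
    grok {
      match => { "message" => "\[%{TIMESTAMP_ISO8601:timestamp}\]%{DATA:class} %{SPACE}%{LOGLEVEL:loglevel} -%{GREEDYDATA:logMessage}" }
    }
  } else {
    # keep everything else, just tag it so it is easy to find
    mutate { add_tag => ["no_grok"] }
  }
}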


If that bare %{GREEDYDATA:logMessage} is one of your patterns, it should always match and you should never get a _grokparsefailure tag. That suggests you are not running the configuration you think you are.
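One quick way to check (a sketch, assuming a standard package install with settings in /etc/logstash) is to validate the configuration Logstash will actually load:

bin/logstash --config.test_and_exit --path.settings /etc/logstash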
