Dead letter queue encoding issues / binary format

Environment: ELK Stack 6.2.3
Version: Logstash 6.2.3
Docker image: docker.elastic.co/logstash/logstash-oss:6.2.3

I have this stack running: filebeat -> logstash -> elasticsearch

I enabled the DLQ after noticing that some logs were not indexed in Elasticsearch.
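For reference, I turned the DLQ on in logstash.yml with something along these lines (the path is the default for the official image, so that line is optional):

    # logstash.yml
    dead_letter_queue.enable: true
    path.dead_letter_queue: /usr/share/logstash/data/dead_letter_queue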
I then see this error in the Logstash log:

[WARN ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"filebeat-2018.08.29", :_type=>"doc", :_routing=>nil}, #<LogStash::Event:0x79bcd0d0>], :response=>{"index"=>{"_index"=>"filebeat-2018.08.29", "_type"=>"doc", "_id"=>"hspqg2UB0_AzXh3zp3Vz", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"object mapping for [host] tried to parse field [host] as object, but found a concrete value"}}}}

The issue I have is that the files generated by the DLQ in /usr/share/logstash/data/dead_letter_queue/main/ contain binary data in a format I can't read:
$ file -i 1.log
1.log: application/octet-stream; charset=binary

Is it normal for the Logstash DLQ to output logs in binary, or could it be corrupting the data from the Filebeat input in the grok filter stage?

This is a sample line from that file:
1c^@^@^E^UM-^?M-^?M-^?M-^?[q^EV^@^@^@^X2018-08-28T21:21:01.473Z^@^@^BM-]M-^_qjava.util.HashMapM-?dDATAM-^_x^Yorg.logstash.ConvertedMapM-?gmessageM-^_torg.jruby.RubyStringX"Content-Length: 1039M-^?dtagsM-^_x^Zorg.logstash.ConvertedListM-^_M-^_torg.jruby.RubyStringX^_beats_input_codec_plain_appliedM-^?M-^_torg.jruby.RubyStringQ_grokparsefailureM-^?M-^?M-^?h@versiona1fsourceM-^_torg.jruby.RubyStringX-/home/ubuntu/default/storage/logs/laravel.logM-^?dhostM-^_torg.jruby.RubyStringPip-172-27-19-197M-^?foffset^[^@^@^@^@&M-Uo9jprospectorM-^_x^Yorg.logstash.ConvertedMapM-?dtypeM-^_torg.jruby.RubyStringClogM-^?M-^?M-^?j@timestampM-^_vorg.logstash.Timestampx^X2018-08-28T21:20:29.205ZM-^?M-^?M-^?dMETAM-^_x^Yorg.logstash.ConvertedMapM-?jip_addressM-^_torg.jruby.RubyStringL10.42.175.22M-^?dtypeM-^_torg.jruby.RubyStringCdocM-^?dbeatM-^_torg.jruby.RubyStringHfilebeatM-^?gversionM-^_torg.jruby.RubyStringE6.2.2M-^?M-^?M-^?M-^?M-^?^@^@^@^Melasticsearch^@^@^@@0830cb3145455a04e7327acad691bdec08c1df201bbc5ea4eff5622a07eff225^@^@^AM-?Could not index event to Elasticsearch. status: 400, action: ["index", {:_id=>nil, :_index=>"filebeat-2018.08.28", :_type=>"doc", :_routing=>nil}, #<LogStash::Event:0x3219e06d>], response: {"index"=>{"_index"=>"filebeat-2018.08.28", "_type"=>"doc", "_id"=>"5bRpgmUBgASq_Qh8OsoZ", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"object mapping for [host] tried to parse field [host] as object, but found a concrete value"}}}

Here is my logstash.conf:

    input {
      beats {
              port => 5044
              ssl => true
              ssl_certificate_authorities => ["/etc/ca.crt"]
              ssl_certificate => "/etc/server.crt"
              ssl_key => "/etc/server.key"
              ssl_verify_mode => "force_peer"
              cipher_suites => [ 'TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256' ]
      }
    }

    ## Add your filters / logstash plugins configuration here

    filter {
    #remove beats info
      mutate {
              remove_field => [ "beat" ]
      }

      grok {
              match => { "message" => ["^\<%{NUMBER:priority}\>%{SYSLOGTIMESTAMP:ingestionDate}%{DATA:microserviceName}\[%{NUMBER:processNumber}\]%{GREEDYDATA:microserviceMessage}",
              	"^%{SYSLOGTIMESTAMP:ingestionDate}%{DATA:hostname}laravel\[%{NUMBER:processNumber}\]:%{DATA:environment}\.%{DATA:result}:%{GREEDYDATA:microserviceMessage}",
              	"^\[%{TIMESTAMP_ISO8601:ingestionDate}\]%{DATA:environment}\.%{DATA:result}:%{DATA:country}-%{GREEDYDATA:microserviceMessage}",
              	"^\[%{TIMESTAMP_ISO8601:ingestionDate}\]%{DATA:environment}\.%{DATA:result}:%{GREEDYDATA:microserviceMessage}"
              ]}
              
              #break_on_match => true
              #Remove unwanted data
              remove_field => [ "message", "host", "prospector.type" ]
              remove_tag => [ "beats_input_codec_plain_applied" ]

      }

      date {
              match => [ "[system][syslog][timestamp]", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
      }
    }


    output {
      elasticsearch {
              user => "logstash"
              password => "logstash"
              hosts => "elasticsearch:9200"
              manage_template => false
              #index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
              index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
            }
     
    }

I don't think this has anything to do with the DLQ. Google the error message "object mapping for [host] tried to parse field [host] as object, but found a concrete value" and you should find useful threads. IIRC it's an incompatible change to the host field made in Filebeat 6.3.

Hi, I found some posts from users with the same mapping issue.
They fixed it by removing the [host] field, which is something I'm already doing in my Logstash filter:

remove_field => [ "message", "host", "prospector.type" ]

I don't have control over the Filebeat part of the stack, and I suspect that some logs are being sent from different Filebeat versions (not 6.2.3, but 6.3 or 6.2).
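One thing worth noting about my config: that remove_field sits inside the grok block, and grok only applies remove_field / remove_tag when one of the patterns actually matches. The dead-lettered event above is tagged _grokparsefailure, so it still carries the host field. A rough sketch of moving the removal into a standalone mutate before grok, so it runs for every event:

    filter {
      # unconditional: runs even for events the grok patterns can't parse
      mutate {
        remove_field => [ "host", "[prospector][type]" ]
      }

      # ... the grok and date filters from logstash.conf follow unchanged,
      # with "message" still removed inside grok so the raw line is kept
      # whenever parsing fails ...
    }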

Could mixing different versions of Logstash and Filebeat be what produces this binary logging?

Is it normal for the Logstash DLQ to output logs in binary

Yes.
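The DLQ files are written in Logstash's internal event-serialization format and aren't meant to be read directly; the supported way to look at them is the dead_letter_queue input plugin. A minimal inspection pipeline would be something along these lines (the path and pipeline_id match the defaults of your Docker image, adjust if yours differ):

    input {
      dead_letter_queue {
        # parent DLQ directory (path.dead_letter_queue)
        path => "/usr/share/logstash/data/dead_letter_queue"
        # sub-directory written by your main pipeline
        pipeline_id => "main"
        # don't advance the stored offset while you're only inspecting
        commit_offsets => false
      }
    }

    output {
      stdout {
        codec => rubydebug { metadata => true }
      }
    }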

I was suspicious of the log format, but I tried the dead_letter_queue Logstash input and was able to see the events correctly. Looking at the events, I also found the mapping problem:
some logs have a string in the host field and others have a nested object.
I'll see if I can convert the string or the nested object to avoid the mapping conflict.
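Maybe something along these lines, flattening the object variant back to a plain string. This is an untested sketch, and it assumes that evaluating [host][name] on an event where host is already a plain string simply comes back false rather than erroring:

    filter {
      # newer Beats send host as an object ({ "name" => "..." }),
      # older ones send a plain string; flatten the object form so
      # every event ends up with a string host
      if [host][name] {
        mutate {
          replace => { "host" => "%{[host][name]}" }
        }
      }
    }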
Thanks!
