I am using filebeat on a Linux box to send logs to ELK (logstash), but the content is not arriving as UTF-8; it looks like a hexadecimal dump. Help please.
Error message in logstash.log:
{:timestamp=>"2016-01-29T11:47:40.722000-0500", :message=>"Received an event that has a different character encoding than you configured.", :text=>"\u0014\u0006X\u0010E\u001D@B\xCA\u0010\x9D\xED{_=z\xF8r2\x99\x9C\xD1\xD7\xEDaaU\xC4\u0016#B\xBD\x87\x8D\xC2\b5tڨ\xC3o", :expected_charset=>"UTF-8", :level=>:warn}
filebeat -> logstash uses JSON and therefore requires content to be encoded as UTF-8 on the sender side. This hex-like dump results from raw content being read that is clearly not UTF-8. What is the encoding of the file you are trying to send?
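If you are not sure, on most Linux systems running `file --mime-encoding /var/log/maillog` (or `file -i`) gives a quick hint about which encoding the file utility detects, which is a good first check.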
plain and utf-8 are basically the same when forwarding to logstash; the difference is that utf-8 applies the hex-encoding of unknown characters at read time.
Can you share some content of the original log files with us for testing? An original file is required; simply copying and pasting into the forum would effectively 'fix' the encoding and hide the problem.
Both filebeat and logstash assume input to be UTF-8, and setting the charset to UTF-8 is mostly a simple copy of your input data (masking invalid characters). The encoding setting must match the encoding of the file you are reading from.
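For example (a hypothetical sketch: assume the maillog turned out to be Latin-1; the actual value must match whatever encoding the file really uses):

paths:
  - /var/log/maillog
# Assumption: the file really is Latin-1. Pick the label that matches
# the actual file encoding, not the encoding you would like to have.
encoding: latin1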
It looks like some kind of misconfiguration, but without some original logs (a raw file) that reproduce the error, I have a hard time seeing where the fault is.
Does this happen for every single log line? If so, can you send a test mail and share the resulting single log line in a separate file for testing?
prospectors:
  # Each - is a prospector. Below are the prospector specific configurations.
  -
    # Paths that should be crawled and fetched. Glob based paths.
    # To fetch all ".log" files from a specific level of subdirectories,
    # /var/log/*/*.log can be used.
    # For each file found under this path, a harvester is started.
    # Make sure no file is defined twice, as this can lead to unexpected behaviour.
    paths:
      - /var/log/maillog
      #- c:\programdata\elasticsearch\logs\*

    # Configure the file encoding for reading files with international characters,
    # following the W3C recommendation for HTML5 (http://www.w3.org/TR/encoding).
    # Some sample encodings:
    #   plain, utf-8, utf-16be-bom, utf-16be, utf-16le, big5, gb18030, gbk,
    #   hz-gb-2312, euc-kr, euc-jp, iso-2022-jp, shift-jis, ...
    encoding: utf-8
I get some output in debug mode, but the message is still garbled: "message" => "3\xB2\xE1m\u000F\xE0\x92J,"
If the output to the console works, it looks more like a problem on the logstash side. Is there any chance you could try sending the log files directly to elasticsearch and check whether they arrive in the right format?
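A minimal filebeat.yml sketch for that test, assuming a local elasticsearch node on the default HTTP port (host and port are assumptions; adjust for your setup):

output:
  elasticsearch:
    # Assumption: elasticsearch runs locally on the default port 9200.
    hosts: ["localhost:9200"]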