manjsr
(Manoj Kumar)
June 3, 2019, 8:57am
1
I'm new here and new to ELK.
I'm trying to use Filebeat to collect XML logs and push them to Kafka, but Filebeat emits Unicode escape sequences instead of the XML angle brackets in the message field: < and > show up as \u003c and \u003e.
I'm using Filebeat 6.6.0.
My XML's logs look like this:
My filebeat.yml looks like this:
The output arriving at Kafka looks like this:
{"@timestamp":"2019-06-03T09:08:43.410Z","@metadata":{"beat":"filebeat","type":"doc","version":"6.6.0","topic":"Topic1"},"beat":{"hostname":"LP-5CD812F3VC","version":"6.6.0","name":"LP-5CD812F3VC"},"host":{"name":"LP-5CD812F3VC"},"offset":0,"log":{"file":{"path":"C:\\WorkSpace\\Filebeat\\logs\\employees - Copy - Copy (2).xml"},"flags":["multiline"]},"message":"\u003cemployees\u003e\n \u003cemployee id=\"111\"\u003e\n \u003cfirstName\u003eManoj\u003c/firstName\u003e\n \u003clastName\u003eSinha\u003c/lastName\u003e\n \u003clocation\u003eIndia\u003c/location\u003e\n \u003c/employee\u003e\n \u003cemployee id=\"222\"\u003e\n \u003cfirstName\u003eAlex\u003c/firstName\u003e\n \u003clastName\u003eGussin\u003c/lastName\u003e\n \u003clocation\u003eRussia\u003c/location\u003e\n \u003c/employee\u003e\n \u003cemployee id=\"333\"\u003e\n \u003cfirstName\u003eDavid\u003c/firstName\u003e\n \u003clastName\u003eFeezor\u003c/lastName\u003e\n \u003clocation\u003eUSA\u003c/location\u003e\n \u003c/employee\u003e","source":"C:\\WorkSpace\\Filebeat\\logs\\employees - Copy - Copy (2).xml","prospector":{"type":"log"},"input":{"type":"log"}}
Can anyone please look into this and help me get the actual XML angle brackets instead of the Unicode escape sequences from Filebeat?
Regards, Manoj
kvch
(Noémi Ványi)
June 3, 2019, 10:00am
2
Do not use screenshots when sharing text. Please paste your configuration here as text and format it using </>.
manjsr
(Manoj Kumar)
June 3, 2019, 10:09am
3
Okay, thanks. Here is the configuration as text.
Input XML file/logs
<employees>
  <employee id="111">
    <firstName>Manoj</firstName>
    <lastName>Sinha</lastName>
    <location>India</location>
  </employee>
  <employee id="222">
    <firstName>Alex</firstName>
    <lastName>Gussin</lastName>
    <location>Russia</location>
  </employee>
  <employee id="333">
    <firstName>David</firstName>
    <lastName>Feezor</lastName>
    <location>USA</location>
  </employee>
</employees>
My filebeat.yml
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - C:\WorkSpace\Filebeat\logs\*.xml
  input_type: log
  document_type: xml
  encoding: UTF-8
  multiline.pattern: '^[[:space:]]'
  multiline.negate: false
  multiline.match: after
#----------------------------- Kafka output --------------------------------
output.kafka:
  hosts: ["localhost:9092"]
  topic: "Topic1"
The output arriving at Kafka looks like this:
{"@timestamp":"2019-06-03T08:44:16.900Z","@metadata":{"beat":"filebeat","type":"doc","version":"6.6.0","topic":"Topic1"},"offset":0,"log":{"file":{"path":"C:\\WorkSpace\\Filebeat\\logs\\employees - Copy - Copy.xml"},"flags":["multiline"]},"message":"\u003cemployees\u003e\n \u003cemployee id=\"111\"\u003e\n \u003cfirstName\u003eManoj\u003c/firstName\u003e\n \u003clastName\u003eSinha\u003c/lastName\u003e\n \u003clocation\u003eIndia\u003c/location\u003e\n \u003c/employee\u003e\n \u003cemployee id=\"222\"\u003e\n \u003cfirstName\u003eAlex\u003c/firstName\u003e\n \u003clastName\u003eGussin\u003c/lastName\u003e\n \u003clocation\u003eRussia\u003c/location\u003e\n \u003c/employee\u003e\n \u003cemployee id=\"333\"\u003e\n \u003cfirstName\u003eDavid\u003c/firstName\u003e\n \u003clastName\u003eFeezor\u003c/lastName\u003e\n \u003clocation\u003eUSA\u003c/location\u003e\n \u003c/employee\u003e","prospector":{"type":"log"},"input":{"type":"log"},"host":{"name":"LP-5CD812F3VC"},"beat":{"hostname":"LP-5CD812F3VC","version":"6.6.0","name":"LP-5CD812F3VC"},"source":"C:\\WorkSpace\\Filebeat\\logs\\employees - Copy - Copy.xml"}
Can anyone please look into this and help me get the actual XML angle brackets instead of the Unicode escape sequences from Filebeat?
@magnusbaeck, @andrewkroh, @pierhugues, @ruflin, could you please help me with this?
Regards, Manoj
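A side note on the multiline settings above, offered as a rough sketch rather than a description of Filebeat's internals: with negate: false and match: after, lines that match the pattern are appended to the preceding line that does not match. Since the closing </employees> tag is not indented, it would start a new event, which may be why the message in the output ends at the last </employee>. The Python below only approximates that grouping. group_multiline is a hypothetical helper, the sample lines are shortened, and \s stands in for the POSIX class [[:space:]], which Python's re module does not support.

```python
import re


def group_multiline(lines, pattern=r"^\s"):
    """Approximate multiline grouping with negate: false, match: after.

    Lines matching the pattern are appended to the previous event;
    any other line starts a new event. This is an illustration only,
    not Filebeat's actual implementation.
    """
    regex = re.compile(pattern)
    events = []
    for line in lines:
        if events and regex.search(line):
            # Continuation line: glue it onto the current event.
            events[-1] += "\n" + line
        else:
            # Non-matching line (or very first line): start a new event.
            events.append(line)
    return events


xml_lines = [
    "<employees>",
    '  <employee id="111">',
    "    <firstName>Manoj</firstName>",
    "  </employee>",
    "</employees>",
]

events = group_multiline(xml_lines)
for number, event in enumerate(events, start=1):
    print(f"event {number}: {event!r}")
```

Under these assumptions the unindented closing tag ends up in a second event of its own, separate from the rest of the document.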
Looks like you need to change the UTF encoding since you are on Windows...
encoding: "utf-16le"
manjsr
(Manoj Kumar)
June 4, 2019, 5:24am
5
I tried every possible combination, but Filebeat always emits Unicode escape sequences instead of the XML angle brackets in the message field. I even tried on Linux, but the issue persists there too.
"message":" \u003cGroupVersion\u003e0\u003c/GroupVersion\u003e"}
Any help would be greatly appreciated. Thanks.
What version of Beats are you using? I found this thread that discusses this Filebeat behavior, possibly in older versions:
I'm in the process of setting up filebeat to ship out logs on some of my servers. One of my logs has angle brackets in it ("<" and ">"). Once filebeat processes it and outputs its JSON representation, those angle brackets have been replaced with \u003c and \u003e, respectively.
Sample log:
Sep 15 17:49:02 [26263] <warning> [rest of log omitted]
JSON output:
{
"@timestamp":"2016-09-16T01:06:24.394Z",
"beat":{
"hostname":"[redacted]",
"name":"[redacted]"
},
"input_type…
manjsr
(Manoj Kumar)
June 4, 2019, 1:55pm
7
@elastikip, I'm using Filebeat version 6.6.0. Please help me with this issue.
{"beat":"filebeat","type":"doc","version":"6.6.0","topic":"Topic1"}
Regards, Manoj
manjsr
(Manoj Kumar)
June 6, 2019, 5:58am
8
Can someone look into this issue and help me figure it out? Any help would be greatly appreciated.
@magnusbaeck, @andrewkroh, @pierhugues, @ruflin
Many thanks, Manoj
manjsr
(Manoj Kumar)
June 7, 2019, 7:06am
9
Hi,
@tylerjl, @warkolm, @abdon, @Kosho_Owa, @casper
Can someone look into this issue and help me figure it out? Any help would be greatly appreciated.
Regards, Manoj
kvch
(Noémi Ványi)
June 7, 2019, 12:08pm
10
Have you tried setting the encoding correctly? The accepted UTF-8 encodings are utf8 or utf-8.
Newer versions have an output.elasticsearch.escape_html config option that you can set to false. I think this would help. In Filebeat 7.0 this defaults to false, but earlier versions had it enabled by default.
https://www.elastic.co/guide/en/beats/filebeat/6.4/elasticsearch-output.html#_literal_escape_html_literal
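To make concrete what HTML escaping in JSON output means, here is a small Python sketch. It is only an emulation of the general idea, not Filebeat's encoder: dumps_html_escaped is a hypothetical helper that rewrites <, > and & as \u003c, \u003e and \u0026 after serializing. Both forms are valid JSON for the same value, so a JSON parser on the consuming side should recover the original characters either way.

```python
import json


def dumps_html_escaped(value):
    """Serialize to JSON, then replace <, > and & with \\uXXXX escapes.

    In serialized JSON these characters can only occur inside strings,
    so a plain text replacement is safe for this illustration.
    """
    text = json.dumps(value)
    for char in "<>&":
        text = text.replace(char, "\\u%04x" % ord(char))
    return text


event = {"message": "<employees>"}

plain = json.dumps(event)             # keeps the angle brackets
escaped = dumps_html_escaped(event)   # uses \u003c and \u003e instead

print(plain)
print(escaped)

# The two strings differ byte for byte but decode to the same value.
print(json.loads(plain) == json.loads(escaped))
```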
manjsr
(Manoj Kumar)
June 10, 2019, 7:23am
12
Thanks @andrewkroh for your suggestion.
I set escape_html: false as shown below, but I'm still getting the same issue. I'm new to Filebeat, so can you please let me know if I'm missing anything? Just for your reference, I'm using Filebeat to collect XML logs and push them to a Kafka topic.
#----------------------------- Kafka output --------------------------------
output.kafka:
  # initial brokers for reading cluster metadata
  hosts: ["localhost:9092"]
  # message topic selection + partitioning
  topic: "Topic1"
  codec.json:
    pretty: true
    escape_html: false
Output logs at Kafka topic:
"message": "\u003cemployees\u003e\n \u003cemployee id=\"111\"\u003e\n \u003cfirstName\u003eLokesh\u003c/firstName\u003e\n \u003clastName\u003eGupta\u003c/lastName\u003e\n \u003clocation\u003eIndia\u003c/location\u003e\n \u003c/employee\u003e\n \u003cemployee id=\"222\"\u003e\n \u003cfirstName\u003eAlex\u003c/firstName\u003e\n \u003clastName\u003eGussin\u003c/lastName\u003e\n \u003clocation\u003eRussia\u003c/location\u003e\n \u003c/employee\u003e\n \u003cemployee id=\"333\"\u003e\n \u003cfirstName\u003eDavid\u003c/firstName\u003e\n \u003clastName\u003eFeezor\u003c/lastName\u003e\n \u003clocation\u003eUSA\u003c/location\u003e\n \u003c/employee\u003e",
"source": "C:\\WorkSpace\\logs\\employees - Copy.xml",
It seems the escape_html setting doesn't have any effect in 6.x. I tried 7.1.1 and those escape sequences went away.
manjsr
(Manoj Kumar)
June 11, 2019, 12:37pm
14
Thanks @andrewkroh !!!
The Unicode escaping issue is fixed in the latest version of Filebeat. I tested locally with version 7.0.1 (filebeat-7.0.1-windows-x86_64) using the same configuration (the v6.6.0 yml file) and got the expected XML tags at the Kafka topic.
However, I have to use Filebeat v6.6.0 and still haven't been able to resolve the issue there.
Regards, Manoj
I think opening a bug report on GitHub is the next step. It looks like the escape_html setting is not being honored, and some debugging is required.
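For anyone who has to stay on 6.6.0 in the meantime, one possible workaround is to leave the escaping alone and decode on the consuming side, since \u003c and \u003e are ordinary JSON escapes for < and >. The sketch below assumes the consumer receives the Kafka record value as UTF-8 bytes and that the message field holds a complete XML document. extract_xml is a hypothetical helper and the sample payload is modeled on the short GroupVersion message posted earlier in the thread.

```python
import json
import xml.etree.ElementTree as ET


def extract_xml(record_value: bytes) -> ET.Element:
    """Decode a Filebeat JSON event and parse its message field as XML.

    json.loads turns \\u003c and \\u003e back into < and >, so no manual
    unescaping is needed before handing the text to the XML parser.
    """
    event = json.loads(record_value.decode("utf-8"))
    return ET.fromstring(event["message"].strip())


# Hypothetical record value, escaped the way Filebeat 6.x emits it.
raw = b'{"message":" \\u003cGroupVersion\\u003e0\\u003c/GroupVersion\\u003e"}'

root = extract_xml(raw)
print(root.tag, root.text)
```

This only helps when the consumer parses the event as JSON. A consumer that treats the record as raw text would still see the escape sequences.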