Python SocketHandler charset

Hello,

I'm using Python's DatagramHandler (a subclass of SocketHandler) to send logs over UDP to Logstash.
My Logstash input config is as follows:

input {
  udp {
    port => 33333
    type => "test"
  }
}

My Python logging config is as follows:

[loggers]
keys=root

[handlers]
keys=udpHandler

[logger_root]
level=DEBUG
handlers=udpHandler

[handler_udpHandler]
class=handlers.DatagramHandler
level=DEBUG
args=('elk.internal.com', 33333)

If I don't set a codec or charset in my Logstash config, I get the following warning in the Logstash logs:

[2017-10-25T10:15:27,175][WARN ][logstash.codecs.plain    ] Received an event that has a different character encoding than you configured. {:text=>"\\u0000\\u0000\\u0002!}q\\u0000(X\\u0004\\u0000\\u0000\\u0000nameq\\u0001X\\u0004\\u0000\\u0000\\u0000rootq\\u0002X\\u0003\\u0000\\u0000\\u0000msgq\\u0003X\\u000F\\u0000\\u0000\\u0000Program startedq\\u0004X\\u0004\\u0000\\u0000\\u0000argsq\\u0005NX\\t\\u0000\\u0000\\u0000levelnameq\\u0006X\\u0004\\u0000\\u0000\\u0000INFOq\\aX\\a\\u0000\\u0000\\u0000levelnoq\\bK\\u0014X\\b\\u0000\\u0000\\u0000pathnameq\\tX7\\u0000\\u0000\\u0000/Users/ajitbansode/PycharmProjects/mypy/logging_main.pyq\\nX\\b\\u0000\\u0000\\u0000filenameq\\vX\\u000F\\u0000\\u0000\\u0000logging_main.pyq\\fX\\u0006\\u0000\\u0000\\u0000moduleq\\rX\\f\\u0000\\u0000\\u0000logging_mainq\\u000EX\\b\\u0000\\u0000\\u0000exc_infoq\\u000FNX\\b\\u0000\\u0000\\u0000exc_textq\\u0010NX\\n\\u0000\\u0000\\u0000stack_infoq\\u0011NX\\u0006\\u0000\\u0000\\u0000linenoq\\u0012K\\u000EX\\b\\u0000\\u0000\\u0000funcNameq\\u0013X\\u0004\\u0000\\u0000\\u0000mainq\\u0014X\\a\\u0000\\u0000\\u0000createdq\\u0015GA\\xD6|\\u0012\\a˞\\xA1X\\u0005\\u0000\\u0000\\u0000msecsq\\u0016G@f\\xB1\\xD2t\\u0000\\u0000\\u0000X\\u000F\\u0000\\u0000\\u0000relativeCreatedq\\u0017G@1\\xF0\\xE5@\\u0000\\u0000\\u0000X\\u0006\\u0000\\u0000\\u0000threadq\\u0018L140736339637184L\\nX\\n\\u0000\\u0000\\u0000threadNameq\\u0019X\\n\\u0000\\u0000\\u0000MainThreadq\\u001AX\\v\\u0000\\u0000\\u0000processNameq\\eX\\v\\u0000\\u0000\\u0000MainProcessq\\u001CX\\a\\u0000\\u0000\\u0000processq\\u001DM\\x8F\\u0003u.", :expected_charset=>"UTF-8"}

If I specify the ASCII charset with the plain codec, I no longer get the warning, but the log message is still not readable.
With the ASCII charset, the record in ES looks like this:

{
"_index": "udp-2017.10.25",
"_type": "test",
"_id": "AV9Si2NNLRwg1kYYNA7q",
"_version": 1,
"_score": 1,
"_source": {
"@timestamp": "2017-10-25T07:59:47.307Z",
"@version": "1",
"host": "192.168.56.13",
"message": "}q(XnameqXrootqXmsgqXDone!qXargsqNX levelnameqXINFOqaXalevelnoqKXpathnameq X7/Users/ajitbansode/PycharmProjects/mypy/logging_main.pyq XfilenameqXlogging_main.pyqXmoduleq
Xlogging_mainqXexc_infoqNXexc_textqNX stack_infoqNXlinenoqKXfuncNameqXmainqXacreatedqGA�|���XmsecsqG@sT9XrelativeCreatedqG@3�c`XthreadqL140736339637184L X threadNameqX MainThreadqXprocessNameqeXMainProcessqXaprocessqMRu.",
"type": "test"
}
}

Please help in this regard.
Thanks.

-Ajit

DatagramHandler writes the log record in Python's pickle format. You should be able to subclass DatagramHandler and override its emit() method to e.g. serialize the record as JSON instead.
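
For illustration, here is a minimal sketch of such a subclass. The class name JsonDatagramHandler, the serialize() helper and the choice of fields are my own assumptions, not something the logging module provides. The leading "\u0000\u0000\u0002!" in the warning above is the 4-byte length prefix that the stock handler puts in front of the pickled record; the sketch sends one plain UTF-8 JSON document per datagram instead.

```python
import json
import logging
import logging.handlers


class JsonDatagramHandler(logging.handlers.DatagramHandler):
    """Hypothetical handler: sends each record as one UTF-8 JSON datagram."""

    # Assumed field selection; adjust to whatever Logstash should index.
    FIELDS = ("name", "levelname", "pathname", "lineno", "funcName", "created")

    def serialize(self, record):
        # Copy a few LogRecord attributes into a plain dict.
        doc = {field: getattr(record, field) for field in self.FIELDS}
        # getMessage() merges record.msg with record.args.
        doc["message"] = record.getMessage()
        if record.exc_info:
            doc["exc_text"] = logging.Formatter().formatException(record.exc_info)
        return json.dumps(doc).encode("utf-8")

    def emit(self, record):
        try:
            # send() is inherited; it creates the UDP socket on first use.
            self.send(self.serialize(record))
        except Exception:
            self.handleError(record)
```

Attached with something like logging.getLogger().addHandler(JsonDatagramHandler('elk.internal.com', 33333)), this should produce datagrams that Logstash can read if the udp input is given codec => json. I haven't tested it against your setup, so treat it as a starting point.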

@magnusbaeck
Hmm, so there is no way to go ahead with pickle format.
Thanks for the reply.

We'll go ahead with our own implementation of DatagramHandler.

Cheers,
-Ajit

Or just use any of the existing libraries:


