Problem with characters "㈰㈱ⴱ㈭ㄶㄶ㄰" in the Message field

Hi Elastic team,

I have a problem with some insertions in my Logstash pipeline. I receive many rows per second, and some events get "_dateparsefailure". In debug mode I could see that this issue happens when the Message field contains the following data:

[2] "_dateparsefailure"
    ],
       "timestampepoch" => 1639682110703,
          "fechafinsub" => "",
              "Message" => "㈰㈱ⴱ㈭ㄶㄶ㄰〸㌱〰㈵ㄱã�´å�†ä¥“ä•’å��呒䌵㄰†††††††††††â��††〰〱〰〰㄰啓å•�剉伱††〰㌹〰〰††††㈰㈱ⴱ㈭ㄶ㈰㈱ⴱ㈭ㄶ†††å�†ä¥“ä•’åŒ â€ â�ƒä±…〰〰〰〰〰〰〰〰〰〰〰〰〰〰〰〠††䥓䕒ã�³ãŒ ††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††‰〠††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††‰〰〰〰〰〠††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††††〰ㅓ†††††††††††††††††††††††††††††††††††††††††††††††††††‰〰ㄠ匰〱â��å‰�å� ä�’ä” å�’ä½”â�ƒå‰…ä�‰â€²ã€±ã„­ã€´â´±ã¤ ㈱〰ⴰㄭ〱â�“′〱ㄭ〴ⴱ㤭ㄵ⸲㔮㈳⸰㔵㔲㌯â�“〰㈠ä™�äµ‰ä±‰ä„ â�†ä…�䥌ä¥�††㈰ㄱⴰã�­ã„¹â€²ã„°ã€­ã€±â´°ã„ åŒ ãˆ°ã„±â´°ã�­ã„¹â´±ã”®ãŒ²â¸³ãˆ®ã€µã”¹ã”²â¼ 匰〳â�†å‰�å•„ä” â€ ä™’ä…•ä‘…â€ â€ â€²ã€±ã„­ã€´â´±ã¤ ãˆ±ã€°â´°ã„­ã€±â�“′〱ㄭ〴ⴱ㤭ㄵ⸳ã�®ã€³â¸°ã”¶ã€´ãŒ¯â�“〰ã� 䙕䱌††â�†å•Œä° †††㈰ㄱⴰã�­ã„¹â€²ã„°ã€­ã€±â´°ã„ åŒ ãˆ°ã„±â´°ã�­ã„¹â´±ã”®ãŒµâ¸²ã¤®ã€µã˜±ãˆ¹â¼ 匰〵â�”ä±�ä¬ ã„ â€ å‘Œäµ‹â€±â€ â€ â€²ã€±ã„­ã€´â´±ã¤ ãˆ±ã€°â´°ã„­ã€±â�“′〱ㄭ〴ⴱ㤭ㄵ⸳㜮ㄸ⸰㔶㈳㠯â�“〰㘠呌䵋′†â�”ä±�ä¬ ãˆ â€ â€ ãˆ°ã„±â´°ã�­ã„¹â€²ã„°ã€­ã€±â´°ã„ åŒ ãˆ°ã„±â´°ã�­ã„¹â´±ã”®ãŒ¸â¸²ã ®ã€µã˜³ã€¸â¼ 匰〷â�”ä±�ä¬ ãŒ â€ å‘Œäµ‹â€³â€ â€ â€²ã€±ã„­ã€´â´±ã¤ ãˆ±ã€°â´°ã„­ã€±â�“′〱ㄭ〴ⴱ㤭ㄵâ

I don't understand why this happens, because I don't see these characters in the source log, so I think it could be something at the communication layer.

I have tried with the following input configuration:

input {

  tcp {
    codec => json_lines { charset => "CP1252" }
    port => 5061
    tags => [ "tcpjson" ]
    # host => "nxlog"
  }
}

And my output is:

output {

  stdout { codec => rubydebug }
}

I appreciate your help.

JSON is, by specification, Unicode. If you have CP1252-encoded strings, the charset converter will certainly do its best to transcode them into Unicode before the JSON parser in the codec kicks in, but you are by definition in unspecified behaviour territory. The characters far outside the usual ASCII/Latin range seem to indicate issues with character conversion.
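
As a quick illustration (a check done in Python rather than in Logstash): the leading characters of the Message value above are exactly what you get when plain ASCII text is decoded as UTF-16 big-endian, which suggests the data is arriving in a different encoding (or byte width) than the codec expects:

# Python 3: decoding plain ASCII bytes as UTF-16BE reproduces the start of the Message field
raw = "2021-12-16".encode("ascii")   # b'2021-12-16'
print(raw.decode("utf-16-be"))       # prints '㈰㈱ⴱ㈭ㄶ'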

That said, the _dateparsefailure comes from a date filter. Somewhere in your pipeline you have one configured to read a field, interpret its contents, and produce a timestamp. It puts this failure tag when the value in the field does not match any of the provided formats. This could be caused by the previously-mentioned encoding issues, or could be something else entirely.
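
As a rough analogy of that matching logic, here is a small Python sketch (not the actual date filter implementation; the timestamp value and format below are made up for illustration):

from datetime import datetime

def parse_or_tag(value, formats):
    # Try each configured format in turn; if none matches, tag the event,
    # which is roughly what the date filter's tag_on_failure behaviour does.
    for fmt in formats:
        try:
            return datetime.strptime(value, fmt), []
        except ValueError:
            continue
    return None, ["_dateparsefailure"]

# A well-formed value parses; a value mangled by the encoding mismatch does not.
print(parse_or_tag("2021-12-16 16:10:08", ["%Y-%m-%d %H:%M:%S"]))
print(parse_or_tag("㈰㈱ⴱ㈭ㄶ", ["%Y-%m-%d %H:%M:%S"]))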

Yes, it is exactly this encoding issue. I understand that the _dateparsefailure is generated by this encoding issue; in fact, if I comment out the date conversion in my configuration, it works.

How can I solve this situation?
