UDP input codec

Hi,
I have an NB-IoT sensor that sends UDP packets containing some data.
I used the default UDP input plugin with the default plain codec, but it does not decode the packet's UDP payload the way I want it to (the headers work fine).

Using Wireshark, I noticed that the sensor's payload is sent to Logstash as raw bytes and not as ASCII-encoded hex digits (unlike the other sensors I have used).

I tried to find another plugin or codec that does not expect ASCII-hex values, but with no luck.
Is there a codec or plugin that can do this? It seems fairly basic, so I'm probably not the first one to run into this issue.

Here is a more detailed version of what is happening:
The UDP payload the sensor sends is: 868963044646776002000000044684000000010000000012253103 (hex values as shown in Wireshark)
What my input config outputs is:
"\x86\x89c\u0004FFw`\u0002\u0000\u0000\u0000\u0004F\x84\u0000\u0000\u0000\u0001\u0000\u0000\u0000\u0000\u0012%1\u0003"
What I would like my input config to output is:
"868963044646776002000000044684000000010000000012253103" as a string.
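
To make the difference between the two forms concrete, here is a minimal plain-Ruby sketch (run outside Logstash, so nothing here is a Logstash API). It only uses the hex string quoted above; the variable names are purely illustrative.

```ruby
# Payload as displayed by Wireshark (hex digits copied from the post above).
hex = "868963044646776002000000044684000000010000000012253103"

# Raw form: every pair of hex digits is one byte on the wire.
raw = [hex].pack("H*")

puts raw.bytesize            # 27  (binary bytes, what the sensor sends)
puts hex.bytesize            # 54  (ASCII characters, what I want as output)
p raw.bytes.first(3)         # [134, 137, 99], i.e. 0x86, 0x89, 0x63 ("c")

# Going back from raw bytes to the hex string is lossless
# as long as the bytes are not transcoded in between.
puts raw.unpack("H*").first == hex   # true
```

The third byte, 0x63, is the letter "c", which is why readable characters such as "c", "F" and "w" show up in the escaped string above.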

Thanks for any help :slight_smile:

Best regards,

Laurent

It's related to the codec charset; the default is UTF-8. Try setting one of ASCII, ISO-8859-1, US-ASCII, Windows-1252 or CP1252. I have tried to convert your sample with several charsets, but without success. That might be because the UTF-8 decoding had already converted it wrongly.

Add this in your input:

	codec => plain {
		charset => "ASCII"
	}

I think you will need to do this in two parts. You could try using

codec => plain { charset => "ASCII-8BIT" }

ASCII-8BIT (a.k.a. BINARY) will just consume the input in one-byte pieces. You then want a string representation of it. To do that you will need a ruby filter. Probably a string unpack to convert the string to an array of bytes, then iterate over the array to append each one in hex to a string.
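
A rough plain-Ruby sketch of that second step, assuming the message really does arrive as untouched binary bytes. The helper name `bytes_to_hex` is hypothetical and the sample is just the first six bytes of the payload quoted earlier.

```ruby
# Hypothetical helper: turn a binary string into a lowercase hex string,
# one byte at a time.
def bytes_to_hex(data)
  data.unpack("C*")                           # array of integers 0..255
      .map { |byte| format("%02x", byte) }    # two hex digits per byte
      .join
end

# First six bytes of the sample payload, tagged as binary with String#b.
sample = "\x86\x89c\x04FF".b

puts bytes_to_hex(sample)        # 868963044646
puts sample.unpack("H*").first   # 868963044646 (same result in one call)
```

Inside a ruby filter the same expression would be applied to `event.get("message")`. Note that `unpack` returns an array, hence the `.first`.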

Hello,
Thank you for your reply. I tried implementing Badger's solution, and here is my config:

input {

  udp{
    port => 7979
    tags => ["nbiot","udp"]
    codec => plain { charset => "ASCII-8BIT" }
  }

}

filter {
  if "nbiot" in [tags] {
    ruby {
        code =>'
                stri = event.get("[message]")
                event.set("[o][data][messageSize]", stri.size)
                event.set("[o][data][unpack_H*]", stri.unpack("H*"))
             '
    }
  } 
}

I do not know if this is what you intended me to do @Badger, but this is what I understood :slight_smile:

The issue is that it does not output what I expect. For example, when 0x86 is sent we get 0xefbfbd, which matches the UTF-8 encoding of the Unicode replacement character (U+FFFD). My understanding is that if an input byte is not recognized as a character (from the ASCII-8BIT table, I guess), it is replaced with something else. But again, I only need the original binary value (I do not need any character interpretation).
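
The substitution can be reproduced in plain Ruby. This is only a sketch of the general mechanism, on the assumption that the codec transcodes the payload to UTF-8 with replacement enabled; it is not taken from the Logstash source.

```ruby
# One raw byte from the payload, tagged as binary (ASCII-8BIT).
byte = "\x86".b

# Untouched, the byte hex-encodes to itself.
puts byte.unpack("H*").first       # 86

# Transcoding binary to UTF-8: 0x86 has no character mapping, so with the
# :replace options Ruby substitutes U+FFFD, the replacement character.
replaced = byte.encode("UTF-8", invalid: :replace, undef: :replace)

puts replaced == "\uFFFD"          # true
puts replaced.unpack("H*").first   # efbfbd
```

Once the replacement has happened the original byte is gone, so no filter applied afterwards can recover it.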

Any idea? I can't believe that I need to create my own codec/plugin for that?

Sorry, I have no idea. I was guessing what you needed to do. Basically you want logstash to consume binary data and it is fundamentally not intended to do that. There may be options that allow it but it is not something I have ever tried.

A similar topic has been opened here

Can you check with tcpdump which hex bytes and visible characters are sent?
Also try saving the payload to a dump file and opening it in Notepad++ to see the character set. Maybe I'm going in the wrong direction, but I don't see any problem except the charset.

Hi,

Sorry the link you sent is broken.

I managed to solve my problem by modifying my local copy of the plain codec (very, very dirty) so that it completely bypasses the charset transcoding.
I intend to write a dedicated codec plugin to address the issue properly (and revert my plain codec to its official version). I tried today, but it does not work yet. I intend to open a separate topic about it, and I'll try to remember to link the solution once it is done.

Thank you everyone :slight_smile:
Best regards,

Laurent

How? By converting the hex bytes to ASCII? What was the difference between 868963044646776002000000044684000000010000000012253103 and what Logstash received?