We are ingesting some Avaya CDR data into our Elastic 8.11 stack, and this is what the data looks like:
\u0001\u0001\u0001\u0002\a\u0001\u0000\u0006\u0001\u0001\u0002\u0001\u0006\b\u0001\u001E\x81
I see the below messages in my logstash logs:
[logstash.codecs.line ][avaya-cdr][7a861177e896f2b58696b6f2fa4a8cc43d896fc2089b4bc7d61d5a0f9dd3be96] Received an event that has a different character encoding than you configured. {:text=>"\u0001\u0001\u0001\u0002\a\u0001\u0000\u0006\u0001\u0001\u0002\u0001\u0006\b\u0001\u001E\x81", :expected_charset=>"UTF-8"}
I've tried updating the codec charset in my Logstash config to ASCII-8BIT, UTF-8, and ISO-8859-1, but none of these settings seem to change the data format. Is there something I'm missing?
This is what my input conf file looks like:
udp {
  port => "xxx"
  type => "avaya-cdr"
  codec => line { charset => "ISO-8859-1" }
  tags => ["avaya-cdr-udp-8516"]
}
tcp {
  port => "xxx"
  type => "avaya-cdr"
  codec => line { charset => "ISO-8859-1" }
  tags => ["avaya-cdr-tcp-8516"]
}
Please let me know if there is something else I need to do to translate the Avaya CDR data in Elastic.
I've never used them, but the documentation for the Avaya CDRs certainly suggests to me that they are plain ASCII. I don't think they use anything that would differ across ASCII / ISO8859-1 / UTF-8.
There is an old PDF here; start at page 1440. There is more current documentation here, and it doesn't look much different.
How is data getting from the output endpoint (CDR1) to logstash?
Somewhere in the documentation I found a mention of UTF-16, though I can't locate it now. You have already tried ASCII-8BIT, UTF-8, and ISO-8859-1; try ASCII as well.
Also try copying a file from /var/home/ftp/CDR and opening it in an editor. Notepad++ will show you the encoding in its Encoding menu.
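One quick local check (a sketch, using the byte sequence from the log message above, not a full CDR record): a single-byte charset like ISO-8859-1 just maps each byte to a character, so switching charsets will never reshape the payload itself; it only determines whether decoding succeeds at all.

```python
# The bytes from the Logstash warning, written out as a Python bytes literal.
raw = b"\x01\x01\x01\x02\x07\x01\x00\x06\x01\x01\x02\x01\x06\x08\x01\x1e\x81"

# ISO-8859-1 maps every byte 0x00-0xFF to a code point, so it never fails --
# but the result is the same opaque control-character soup, one char per byte.
latin1 = raw.decode("iso-8859-1")
print(len(latin1))  # 17

# UTF-8, by contrast, rejects the trailing 0x81, which is not valid UTF-8 on
# its own -- this is exactly why Logstash logs the charset warning.
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    print("utf-8 decode failed:", exc.reason)
```

If your payload triggers the same failure, the data is not text in any of the charsets you tried, and the fix is upstream (getting the switch to emit formatted/unformatted ASCII CDRs) rather than in the codec setting.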