CSV with fields in Portuguese

(Sharon Sasporta) #1

I have a CSV file in which some of the fields are in Portuguese.

In the Logstash input I set: codec => plain { charset=>"UTF-8" }

I am getting the following warnings:

[2019-02-16T14:38:25,876][WARN ][logstash.codecs.plain ] Received an event that has a different character encoding than you configured. {:text=>"INC000569147,Designado,COPM do Brasil,Produ\\xE7\\xE3o,Triagem Back End,2019-01-21 07:00:38,,,,\\r", :expected_charset=>"UTF-8"}

I tried to solve it in the filter with Ruby:

 ruby {
        code => "
                 # Relabel each field as ISO-8859-1, skipping fields that are absent
                 ['Classification', 'Organization Support Group', 'Designated Group'].each do |field|
                      value = event.get(field)
                      event.set(field, value.to_s.force_encoding('ISO-8859-1')) unless value.nil?
                 end
        "
 }

but I am still getting the same warnings.

Any better idea how to solve it?



Produção? E7 E3 is not UTF-8; in UTF-8 it would be \xC3\xA7\xC3\xA3. It might be ASCII-8BIT.
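
The byte claim above can be checked in plain Ruby, outside Logstash. This is only a minimal sketch: the string is copied from the warning text, and treating the bytes as ISO-8859-1 is an assumption (Windows-1252 maps E7 and E3 to the same characters).

```ruby
# Bytes copied from the warning: "Produ\xE7\xE3o"
raw = "Produ\xE7\xE3o".b                      # binary (ASCII-8BIT) copy of the bytes

# Labelled as UTF-8, the byte sequence is not valid.
as_utf8 = raw.dup.force_encoding('UTF-8')
puts as_utf8.valid_encoding?                  # => false

# Assumption: the file is really ISO-8859-1. Relabel the bytes, then transcode.
utf8 = raw.dup.force_encoding('ISO-8859-1').encode('UTF-8')
puts utf8                                     # => Produção
puts utf8.bytes.map { |b| format('%02X', b) }.join(' ')
# => 50 72 6F 64 75 C3 A7 C3 A3 6F
```

If that assumption holds, a codec charset of "ISO-8859-1" may be a closer match for this file than "UTF-8".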

(Sharon Sasporta) #3

I did that and all the warnings stopped.

I can still see some gibberish:

Designated Group Monitora��o Neg�cio

The real value is: Monitoração Negócio

It isn't able to display the Portuguese special characters.
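
One possible explanation for the remaining gibberish, sketched in plain Ruby rather than taken from Logstash itself: when bytes are declared as binary (ASCII-8BIT), the high bytes have no defined character, so a conversion to UTF-8 can only substitute the replacement character. The sample string is rebuilt from the value shown above, and ISO-8859-1 as the real source charset is an assumption.

```ruby
# "Monitoração Negócio" as it would be stored in an ISO-8859-1 file (assumed)
raw = "Monitora\xE7\xE3o Neg\xF3cio".b

# Binary has no mapping for E7, E3, F3, so each one becomes U+FFFD.
lossy = raw.encode('UTF-8', invalid: :replace, undef: :replace)
puts lossy

# Declaring the assumed source charset lets every byte map to a character.
fixed = raw.dup.force_encoding('ISO-8859-1').encode('UTF-8')
puts fixed                                    # => Monitoração Negócio
```

If this matches what happens here, setting the codec charset to "ISO-8859-1" (or "Windows-1252") instead of "ASCII-8BIT" may be worth trying.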

