What charset to use?


(fares oueslati) #1

Hi,

I'm trying to read a non-UTF-8 file. It's an IBM 500 file, but I can't find
that charset in the list provided with the plugin, so I tried "US-ASCII".

Here is my config file:

input {
  file {
    path =>...
    sincedb_path => "/dev/null"
    codec => plain { charset => "US-ASCII" }
    start_position => "beginning"
  }
}

output { stdout {} }

It doesn't work.

What I don't understand is that I don't get any error message: Logstash starts and does nothing.

On my terminal I only see this message: "Logstash startup completed"

Any help, please?

Thanks


(Magnus Bäck) #2

With sincedb_path set to /dev/null and start_position set to "beginning" Logstash should indeed unconditionally read the file from the beginning. Increasing the logging verbosity by starting Logstash with --verbose or even --debug could give more clues about what it's doing. One possibility is that the Logstash process doesn't have permission to read the file, or that there's a typo in the filename or filename pattern.
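
The last two possibilities (a pattern that matches nothing, or a file the process cannot read) can also be checked outside Logstash. Here is a minimal Python sketch, assuming it is run as the same user as the Logstash process. The function name `check_input_paths` and the example pattern are hypothetical, and Python's glob rules may not match the file input's pattern handling exactly.

```python
import glob
import os


def check_input_paths(pattern):
    """Return (path, readable) pairs for every path matching the glob pattern."""
    matches = sorted(glob.glob(pattern))
    # os.access reports whether the current user may read each path.
    return [(path, os.access(path, os.R_OK)) for path in matches]


if __name__ == "__main__":
    # Hypothetical pattern: substitute the value of the file input's `path` setting.
    results = check_input_paths("/var/log/example/*.log")
    if not results:
        print("pattern matched no files")
    for path, readable in results:
        print(path, "readable" if readable else "NOT readable")
```

An empty result points to a typo in the path or pattern; a "NOT readable" entry points to a permission problem.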


(fares oueslati) #3

Thanks for your reply.

Logstash didn't do anything because the encoded file was treated as a single line, so I added a CR end of line, and now it displays characters like these: "����@@@@"

I tried almost all the charset values listed here: https://www.elastic.co/guide/en/logstash/current/plugins-codecs-plain.html

None of them seems to work.
With Talend I can decode the file using the IBM500 charset, but that charset isn't available in the plain codec.

Is there any way to do this with Logstash?
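
For what it's worth, the "@@@@" output is consistent with EBCDIC data being read as ASCII: byte 0x40 is a space in IBM500 but "@" in ASCII. One possible workaround, assuming a preprocessing step outside Logstash is acceptable, is to transcode the file to UTF-8 first and point the file input at the result. The sketch below uses Python's built-in `cp500` codec (its name for IBM500); the function names are hypothetical and this has not been tried against the actual file.

```python
def decode_ebcdic(raw, source_encoding="cp500"):
    """Decode raw EBCDIC bytes (IBM500 by default) into a Python string."""
    return raw.decode(source_encoding)


def transcode_to_utf8(src_path, dst_path, source_encoding="cp500"):
    """Read src_path as EBCDIC and write the same text to dst_path as UTF-8."""
    with open(src_path, "rb") as src:
        text = decode_ebcdic(src.read(), source_encoding)
    # newline="" keeps whatever line endings the decoded text already contains.
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)
    return len(text)


if __name__ == "__main__":
    # Four 0x40 bytes are four spaces in IBM500, not "@@@@".
    print(repr(decode_ebcdic(b"\x40\x40\x40\x40")))
```

After calling `transcode_to_utf8` on the source file, the plain codec's default UTF-8 charset should be able to read the output, provided the decoded text contains the line breaks the file input expects.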

