I'm trying to index data from PostgreSQL into Elasticsearch using Logstash. The database is encoded in UTF-8. Here's a sample of the data (the first line is the column name, followed by the row values):
nameöäå
First name with å
Second name with ä
Third name with ö
But when I index the fields into Elasticsearch, they come out as follows:
"nameöäå": "First name with å",
"nameöäå": "Second name with ä",
"nameöäå": "Second name with ä",
I know this is an encoding issue, but I can't figure out the root cause: the PostgreSQL database is UTF-8, and Logstash should also default to UTF-8. My versions are PostgreSQL 9.5, JDBC driver 42.2.4, Elasticsearch oss:6.3.1, and Logstash oss:6.3.1, all running via docker-compose.
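Since the garbled output looks like UTF-8 bytes being decoded as Latin-1, I suspect the JVM default encoding or locale inside the Logstash container. This is a sketch of the override I'm considering adding to my docker-compose file; the variable values are assumptions on my part, not something I've confirmed fixes this:

```yaml
services:
  logstash:
    environment:
      # Force a UTF-8 locale inside the container
      - LANG=C.UTF-8
      - LC_ALL=C.UTF-8
      # Force the JVM's default charset to UTF-8
      - LS_JAVA_OPTS=-Dfile.encoding=UTF-8
```

Would something like this even affect how Logstash decodes the JDBC results?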
Here is my Logstash configuration file (logstash.conf):
input {
  jdbc {
    jdbc_connection_string => "jdbc:postgresql://db:5432/mydb"
    jdbc_user => "username"
    jdbc_password => "password"
    jdbc_driver_library => "/logstash_dir/postgresql-42.2.4.jar"
    jdbc_driver_class => "org.postgresql.Driver"
    type => "company"
    statement => "SELECT * from company"
  }
}
output {
  elasticsearch {
    index => "%{type}"
    document_type => "%{type}"
    hosts => ["http://elasticsearch:9200"]
  }
}
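I've also been wondering whether I should declare the encoding directly on the jdbc input. I believe the plugin has a `charset` option (and a per-column `columns_charset` variant), though I haven't verified that it applies to my case. Something like:

```
input {
  jdbc {
    jdbc_connection_string => "jdbc:postgresql://db:5432/mydb"
    jdbc_user => "username"
    jdbc_password => "password"
    jdbc_driver_library => "/logstash_dir/postgresql-42.2.4.jar"
    jdbc_driver_class => "org.postgresql.Driver"
    type => "company"
    statement => "SELECT * from company"
    # Assumption: tell the plugin how to interpret incoming column data;
    # I haven't confirmed this option changes the behavior I'm seeing.
    charset => "UTF-8"
  }
}
```

Is forcing the charset here the right approach, or is the problem somewhere else in the pipeline?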