Hi,
I am running Logstash to insert data from MySQL database to Elasticsearch.
In MySQL database I have one-to-many relationship so one value from table A has zero or more values in table B.
I have made the configuration and the suitable SQL statement and I am using jdbc logstash input plugin to do this.
my config file looks like:
input {
jdbc {
jdbc_driver_library => "./mysql-connector-java-5.1.45-bin.jar"
jdbc_driver_class => "com.mysql.jdbc.Driver"
jdbc_connection_string => "jdbc:mysql://myhost:portnum/db_name"
jdbc_user => "name"
jdbc_password => "secret"
schedule => "* * * * *"
statement => "SELECT a, b
FROM A
JOIN B
ON A.b_id = B.id
GROUP BY a, b"
}
}
output {
elasticsearch {
action => 'update'
document_id => "%{a}"
"hosts" => "http://myEShost:portnumber"
"index" => "my_index"
}
stdout {
codec => json_lines
}
}
So I use field "a" as _id in my index, however, for each "a" I have multiple "b" and I want all these values to be inserted as an array in field "b" in my index.
I managed to solve this by changing my output script field to if (ctx._source.b == null) { ctx._source.b = new ArrayList(); } if(!ctx._source.b.contains("%{b}")) { ctx._source.b.add("%{b}"); }
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.