How to load big table to es with logstash

Hi :
I hava a mysql table named "orders" with ten million records ,I am trying to import all the data to Es with the plugin logstash-input-jdbc, but it seems cost a long time to finish this task, I have optimized all the params as far as I know,such as the mysql "useCursorFetch=true" ,"jdbc_fetch_size","jdbc_paging_enabled" ,but seems not work as I expected. Could anyone help me solve this problem;Thanks in advance;

my logstash config file is as below:

input {
stdin {
jdbc {
# mysql jdbc connection string to our backup databse
jdbc_connection_string => "jdbc:mysql://"
# the user we wish to excute our statement as
jdbc_user => "user"
jdbc_password => "test"
# the path to our downloaded jdbc driver
jdbc_driver_library => "/home/api/mysql-connector-java-5.1.25.jar"
# the name of the driver class for mysql
jdbc_driver_class => "com.mysql.jdbc.Driver"
jdbc_paging_enabled => "true"
jdbc_fetch_size => "50000"
statement_filepath => "../jdbc.sql"
schedule => "*/2 * * * * *"
type => "jdbc"

filter {
json {
source => "message"
remove_field => ["message"]

output {
elasticsearch {
hosts => ["localhost:9200"]
index => "test"
document_type => "orders"
document_id => "%{id}"

And the jdbc.sql file context is :
select * from orders

How long?

I have improted 3100000 records to es ,and takes about 50 minutes, the store.size on es is 2.1GB

What sort of monitoring do you have in place to tell if it's the DB or Elasticsearch/Logstash?

1 Like

I use the ELK tools with logstash-input-jdbc , there is no other monitoring tools.

I would recommend running the configuration once while replacing the elasticsearch output with e.g. a file output. That way you can check what the throughput of reading from the database is. Also monitor how much CPU you are using while doing this.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.