I'm migrating MySQL data to Elasticsearch using Logstash.
The table has more than 9 crore (90 million) records. I tested the query in MySQL Workbench, where it takes 0.032 sec to execute.
When I run it from Logstash, it takes more than 600 sec.
What could be the reason?
Could you please help me?
Sorry, 9 crore records.
Actually, I'm migrating the XXX table from MySQL to Elasticsearch using Logstash. The XXX table has 9 crore records. I process the records in pages using the following script:
page = 100000
select * from table_name where id between 0 and page;
It takes 0.032 sec in MySQL Workbench.
But when I run the same query through Logstash, it takes more than 6 minutes.
Could you please help me? Why is it taking so much time?
I'm not talking about the entire indexing process; I'm talking about the running time of the query alone.
When I run it from MySQL Workbench, it takes only 0.032 sec for 1 million records. The same query run through Logstash takes more than 6 minutes. Fetching the result alone takes 6 minutes. That's why I'm wondering.
Is that 0.032 seconds the time to first results or the time to actually read out all the 1 million results? Are you verifying that you are reading out all records, e.g. by dumping them to a file?
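One way to check this on the Logstash side is to temporarily write every event to a file and count the lines afterwards. This is only a sketch: the path below is a hypothetical placeholder, and the input section stays whatever you already have.

```
output {
  file {
    # Hypothetical path, chosen for illustration; use any writable location.
    path => "/tmp/jdbc_dump.ndjson"
    # One JSON document per line, so the line count equals the row count.
    codec => json_lines
  }
}
```

If the line count of the file matches the number of rows you expect, Logstash really did read the whole result set, and the elapsed time covers the full fetch rather than only the first rows.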
Sorry,
for 100 thousand records it takes 0.032 sec in a single fetch.
input {
  jdbc {
    # Logstash settings are assigned with "=>", not "="
    jdbc_driver_library => "mysql-connector-java-5.1.46.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/database"
    jdbc_user => "abc"
    jdbc_password => "xyz"
    # Inline SQL goes in "statement"; "statement_filepath" expects a path to a .sql file
    statement => "select * from table_name where id between 1 and 100000"
  }
}
The query execution itself takes 6 minutes when I run this conf from Logstash.
But when I run it from MySQL Workbench, it takes 0.032 sec.
Why is there such a big difference?
In your Logstash config, where are you sending the data? What does the rest of your configuration look like? Logstash will only read data as fast as the slowest downstream system can accept them, so that might be a bottleneck.
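To rule out the downstream side, you could temporarily replace your real output with one that does almost no work. A minimal sketch, assuming the input section is left unchanged:

```
output {
  # Prints one "." per event instead of sending data anywhere,
  # so the run time reflects the jdbc input (and any filters) only.
  stdout { codec => dots }
}
```

If the run is still slow with this output, the bottleneck is in the query or the fetch, not in the system receiving the data.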
I found the issue. I had missed one index in the query part. When I added the index to the query, it ran fast. But I'm still wondering: without the index, the query completed within 0.032 sec in Workbench for 1 million records, but when I run it through Logstash, the query execution alone takes around 6 minutes.