Hi,
I am currently using Logstash with the SQL Server JDBC driver to import data from SQL Server into Elasticsearch. The problem I am facing is that the SQL table has 4 million records in it, and it takes forever to upload even 100K of them. Is there a faster way for me to index the SQL data into Elasticsearch?
Note: I have looked at the river plugin, but it has been removed from newer versions of Elasticsearch.
I also tried the JDBC importer, but it does not work. (If anyone can give me instructions on how to use it with SQL Server, I would really appreciate it.)
Have you identified what is limiting performance? How quickly are you able to read data from the database if you do not index them into Elasticsearch? How quickly are you able to index data into your cluster if you feed from a file instead of reading from the database?
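One way to answer the first of these questions is to time the read side on its own. Below is a minimal Python sketch, not taken from the thread: `measure_throughput` is a hypothetical helper that simply drains any iterable of rows (a database cursor, a CSV reader, and so on) and reports rows per second, with nothing being sent to Elasticsearch.

```python
import time


def measure_throughput(rows, clock=time.perf_counter):
    """Consume an iterable of rows and return (count, seconds, rows_per_second).

    `rows` is an assumption: any iterable, such as a DB-API cursor or csv.reader.
    Nothing is indexed, so this measures the read side only.
    """
    start = clock()
    count = 0
    for _ in rows:
        count += 1
    elapsed = clock() - start
    # Guard against a zero-length interval on very small inputs.
    rate = count / elapsed if elapsed > 0 else float("inf")
    return count, elapsed, rate


if __name__ == "__main__":
    # Stand-in data source; replace with a real cursor or file reader.
    fake_rows = ({"id": i} for i in range(100_000))
    count, elapsed, rate = measure_throughput(fake_rows)
    print(f"read {count} rows in {elapsed:.3f}s ({rate:.0f} rows/s)")
```

If the read rate measured this way is already low, the bottleneck is likely the database or the query rather than Elasticsearch.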
Hi Christian,
I tried using a CSV file today and even that is taking a lot of time. It has been running for over an hour and it still seems to be doing the same thing.
As for checking what is taking so long, can you please point me to where I can find information on how to do that?
What is the specification of your cluster/node (CPUs/RAM/Storage/Heap)? How large are your documents? What indexing throughput are you seeing? What bulk size are you using? Which version of Elasticsearch and Logstash?
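To make the bulk size question concrete, here is a rough Python sketch of how rows can be grouped into batches and turned into a request body for the Elasticsearch `_bulk` API, which expects newline-delimited JSON: an action line followed by the document source, with a trailing newline. The helper names `chunked` and `bulk_body` and the index name are assumptions for illustration, and older Elasticsearch versions may also require a `_type` in the action line.

```python
import json
from itertools import islice


def chunked(rows, size):
    """Yield lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def bulk_body(batch, index_name):
    """Build a newline-delimited JSON body for one _bulk request.

    Each document contributes an action line and a source line.
    `index_name` is a placeholder; older versions may also need "_type".
    """
    lines = []
    for doc in batch:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    rows = [{"id": i, "name": f"row {i}"} for i in range(5)]
    for batch in chunked(rows, 2):
        print(bulk_body(batch, "my_index"), end="")
```

The batch size passed to `chunked` is the knob being asked about: very small batches mean many round trips, while very large ones produce oversized requests, so it is usually worth trying a few values and comparing throughput.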
I am in the same boat as you. I have been trying to import data into Elasticsearch and would really appreciate it if you could shed some light on this. I followed the link below, but it didn't really work. http://hintdesk.com/how-to-connect-elasticsearch-to-ms-sql-server/