I'm new to Elastic and I can't find a way to add a large amount of data (read: 150k SQL rows) to my index at once.
I'm using Postman to execute the endpoints like
What I would like is to copy my 150k rows from my SQL database table and somehow get them into my Elasticsearch index. What is the best way to do this?
Here is an example of how one row looks as a JSON object (information is blurred because it contains client information):
Please keep in mind that the stacktrace especially can be very long.
Any help is welcome! Thanks!
To import a large amount of data from a SQL database to Elasticsearch, you can use Logstash, which is a data collection pipeline tool. It can fetch data from various sources including SQL databases, transform it if necessary, and load it into Elasticsearch.
Here are the steps you can follow:
- Install Logstash: You can download it from the official Elasticsearch website.
- Create a Logstash configuration file: This file will define the input, filter, and output. In your case, the input will be your SQL database, and the output will be Elasticsearch. If you need to transform your data, you can define it in the filter section.
Here is a sample configuration file:
input {
  jdbc {
    jdbc_driver_library => "/path/to/your/jdbc/driver"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/yourdatabase"
    jdbc_user => "yourusername"
    jdbc_password => "yourpassword"
    statement => "SELECT * from yourtable"
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "yourindex"
    # document_type is deprecated; it can be left out on Elasticsearch 7 and later
    document_type => "yourtype"
  }
}
- Run Logstash with your configuration file: You can do this with the following command:
bin/logstash -f /path/to/your/config/file
This will start the data import process. Logstash will fetch data from your SQL database and load it into Elasticsearch.
Please note that the above steps are a basic example. Depending on your specific needs, you might need to adjust the configuration file, for example if your SQL query is complex or if you need to transform your data before loading it into Elasticsearch.
Also, keep in mind that importing a large amount of data can take some time and might put a significant load on your Elasticsearch cluster. Therefore, it's recommended to monitor your cluster's health and performance during the import process.
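As a rough illustration of such adjustments, here is a sketch of the same pipeline with paging turned on for the 150k rows, a small filter section, and a fixed document ID. It assumes the table has a unique column named `id` (hypothetical, not taken from the question), and the page size of 10000 is an arbitrary starting value rather than a recommendation. Row order between pages may not be guaranteed, so treat this as a starting point to test against your own data.

```
input {
  jdbc {
    jdbc_driver_library => "/path/to/your/jdbc/driver"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/yourdatabase"
    jdbc_user => "yourusername"
    jdbc_password => "yourpassword"
    # Fetch the table in pages instead of one large result set
    jdbc_paging_enabled => true
    # Arbitrary example value; tune it for your database
    jdbc_page_size => 10000
    statement => "SELECT * from yourtable"
  }
}

filter {
  mutate {
    # Drop a Logstash bookkeeping field that is not part of the SQL row
    remove_field => ["@version"]
  }
}

output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "yourindex"
    # Assumes a unique `id` column (hypothetical); reusing it as the document ID
    # means a re-run overwrites documents instead of creating duplicates
    document_id => "%{id}"
  }
}
```

Setting `document_id` is optional. Without it, Elasticsearch generates IDs itself, and running the import twice would index every row twice.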
OpsGPT helped with part of this answer