Including data from SQL tables to Elasticsearch

mar-ro · April 29, 2021, 9:51am

Good morning. I would like to know if it would be possible, in order to increase the performance of queries in a SQL database, to perform a data dump from the SQL database to elasticsearch, is that possible, or is it only possible to include data in non-SQL format? Thank you

dadoonet · April 29, 2021, 11:37am

You need to send your existing data to elasticsearch.
That means:

Read from the database (SELECT * from TABLE)
Convert each record to a JSON Document
Send the json document to elasticsearch, preferably using the _bulk API.

Logstash can help for that. But I'd recommend modifying the application layer if possible and send data to elasticsearch in the same "transaction" as you are sending your data to the database.

I shared most of my thoughts there: http://david.pilato.fr/blog/2015/05/09/advanced-search-for-your-legacy-application/

Have also a look at this "live coding" recording.

mar-ro · April 29, 2021, 11:54am

Thank you for your quick reply David.
I just need to know how I should treat the relationship between tables when entering the data in elasticsearch, I have seen that there is a field called "nested" but it is not entirely clear to me in which cases to use it... According to what I saw, when you want to query an array of objects associated to an object, you have to use the "nested" field so that when you flatten the array, the existing relationship between the data is not lost, is that right? If so, for example, if I wanted to store in elasticsearch an object that represented a house and in that object an array of objects that included the different appliances it includes (and their technical specifications) so that I could later search for those houses that had an appliance of a specific year, should I include the array of appliances as an object of type "nested" when declaring the index?
If so, would it be counterproductive to use the "nested" type versus a simple "object" type in the future if the volume of data increases too much?
Thanks in advance for your answer.
Best regards!

dadoonet · April 29, 2021, 2:15pm

I'd totally forget about the existing model and would just think about the usage to build the right "search" objects for my use case.
Here I can't really answer as I don't know the use case.

Basically I'd recommend to ask yourself 2 questions:

What kind of objects my users want to get back as a response? If it's object X, then just index object X
What typical attributes my users want to search for? Let say I need attribute a, b and c, just index those attributes within object X whatever the original source of those attributes is.

Most of the time you will need to denormalize your data. Doing joins at index time is one of the key.

mar-ro · April 29, 2021, 2:27pm

Ok, thank you very much for your answer David, you have cleared up my doubts. Best regards

system · May 27, 2021, 2:28pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
How to filter data while migrating from SQL to Elasticsearch using Logstash to obtain a nested strings array in a JSON object Logstash	1	438	October 5, 2021
Logstash converting data from a flattened table from a SQL query into JSON inner object Logstash	5	780	September 19, 2019
Periodically indexing SQL data to ES using Logstash Elasticsearch	5	731	November 8, 2017
How to use jdbc to import data into nested objects? Logstash	8	10039	November 4, 2022
Elasticsearch and sql tables Elasticsearch	7	443	November 25, 2018

Including data from SQL tables to Elasticsearch

Related topics