Updating data

Hi! I have a problem with updating my data in Elasticsearch.
I use PostgreSQL. When I transfer data from the database to Elasticsearch through Logstash, the same data gets imported again and again.
If I have, for example, 1000 rows in the database, then over time I have 2000 hits, then 3000, and eventually 1000*n hits.
But if I set the field document_id => "%{[ID]}" in the elasticsearch output in Logstash, then I have only 1000 hits, but those hits keep getting updated again and again.
Please give me advice on this problem.
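For reference, this is roughly what my elasticsearch output looks like (the host and index name here are placeholders, not my real values):

    output {
      elasticsearch {
        hosts       => ["http://localhost:9200"]
        index       => "mytable"
        # Each row keeps the same document ID, so re-imports overwrite
        # the existing document instead of creating a duplicate.
        document_id => "%{[ID]}"
      }
    }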

I don't understand what the problem is. How many rows do you want to have? What should happen with the rows you fetch from the database?

My Logstash keeps adding data from the database again and again. If I have 100 rows in the database, Logstash adds 100 rows, then 100 more rows, and so on.
If I set the field document_id => "%{[ID]}" in the elasticsearch output in Logstash, then the program updates my data again and again. For example it shows 88 updated rows, then after a refresh (F5) 99 rows, then 3 rows, then 45 rows, and so on.

So you want to update the document only when the corresponding database row has actually been updated?

If your table has a column with a "last modified" timestamp you can use the sql_last_value query parameter to select only those rows that have been modified since the last run (see the jdbc input documentation).
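A minimal sketch of such a jdbc input, assuming a PostgreSQL table with a "last modified" column named modified_at (the driver path, connection string, credentials, and table name are placeholders to adapt to your setup):

    input {
      jdbc {
        jdbc_driver_library    => "/path/to/postgresql.jar"
        jdbc_driver_class      => "org.postgresql.Driver"
        jdbc_connection_string => "jdbc:postgresql://localhost:5432/mydb"
        jdbc_user              => "user"
        jdbc_password          => "secret"
        schedule               => "* * * * *"   # run once per minute
        # :sql_last_value holds the highest modified_at seen so far, so
        # each run fetches only rows changed since the previous run.
        statement              => "SELECT * FROM mytable WHERE modified_at > :sql_last_value"
        use_column_value       => true
        tracking_column        => "modified_at"
        tracking_column_type   => "timestamp"
      }
    }

Combined with document_id => "%{[ID]}" in the elasticsearch output, this makes each run both incremental and idempotent: only changed rows are fetched, and each row always maps to the same document.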

If you don't have such a column and you can't add one, you're going to have to keep updating all documents on every run.

Logstash will not handle deletions of database rows.

What do I do if Logstash keeps adding data to Elasticsearch with every update? I end up with excess data.
For example, I have 10 rows in the database, but after uploading them to Elasticsearch, over time I will have 10, 20, 30, ..., 200 rows, which are duplicates.
