Hi,
I am using a JDBC connection to fetch data from a database.
I have a single index, let's say Contacts.
I am able to fetch data from the database, but what I want is that when I run Logstash a second time it should only add the rows that are new in the database to Elasticsearch.
Explanation:
Running Logstash for the first time:
ES has 5 documents in the 'Contacts' index; the database has 5 rows.
Running Logstash a second time:
ES has 11 documents in the 'Contacts' index; the database has 6 rows.
So I am getting multiple entries of the same data. I want only the newly added data when running Logstash a second time.
This has been discussed several times before, but the idea is to not use Elasticsearch's automatically generated document id and instead set your own document id based on one or more fields that come from columns returned by the database query.
The elasticsearch output's document_id option can be used for this: if your events have, for example, an id field containing (typically) the primary key from the database, you can use document_id => "%{id}" to use that value as the document id in ES.
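A minimal pipeline sketch of that idea, assuming the query returns a primary-key column named id and a MySQL database (the connection details, table, and column names here are placeholders for illustration, not your actual setup):

```conf
input {
  jdbc {
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"  # hypothetical connection
    jdbc_user => "user"
    jdbc_password => "password"
    jdbc_driver_library => "/path/to/mysql-connector-java.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    statement => "SELECT id, name, email FROM contacts"           # hypothetical query
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "contacts"
    document_id => "%{id}"  # reuse the primary key as the ES document id
  }
}
```

With this in place a re-run indexes each row under the same _id, so existing documents are overwritten rather than duplicated and only genuinely new rows increase the document count.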