Indexing only newly added data

Hi,
I am using a JDBC connection to fetch data from a database.

I have a single index, let's say Contacts.

I am able to fetch data from the database. What I want is that when I run Logstash a second time, it adds only the data that is new in the database to Elasticsearch.

Explanation:

  1. Running Logstash for the first time
    ES will have 5 documents in the 'Contacts' index. The database has 5 rows.
  2. Running Logstash for the second time
    ES will have 11 documents in the 'Contacts' index. The database has 6 rows.

So what is happening is that I end up with multiple entries for the same data. I want only the newly added data to be indexed when running Logstash the second time.

How can I achieve this?

This has been discussed several times before, but the idea is not to use Elasticsearch's automatically generated document ID but to set your own document ID based on one or more fields that originate from columns returned by the database query. When a row is fetched again, it then overwrites the existing document with the same ID instead of creating a duplicate.

The elasticsearch output's document_id option can be used for this. If your events get, for example, an id field containing (typically) a primary key from the database, you can set document_id => "%{id}" to use that value as the document ID in ES.
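As a rough illustration, a minimal pipeline along these lines might look like the sketch below. The driver path, driver class, connection string, credentials, table name, and column names are all placeholders assumed for the example and are not taken from the original question. The index name is written in lowercase because Elasticsearch does not accept uppercase index names.

```
input {
  jdbc {
    # Placeholder connection settings (assumed; adjust for your database)
    jdbc_driver_library => "/path/to/mysql-connector-j.jar"
    jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "user"
    jdbc_password => "password"
    # Hypothetical table; the query must return the primary key column (id here)
    statement => "SELECT id, name, email FROM contacts"
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "contacts"
    # Use the database primary key as the document ID, so a row that is
    # fetched again overwrites its existing document instead of duplicating it
    document_id => "%{id}"
  }
}
```

With a setup like this, running Logstash a second time against 6 rows should leave 6 documents in the index rather than 11, since the 5 rows already indexed are written to the same document IDs.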

Thanks @magnusbaeck.

I got clearer information from the post below.
