Documents in Elasticsearch getting deleted automatically?


(Kulasangar Gowrisangar) #1

I'm creating an index through Logstash and pushing data to it from a MySQL database. What I noticed in Elasticsearch is that once all the data has been uploaded, it starts deleting some of the docs. The total number of docs is 160729. Without the scheduler it works fine.

I added the cron schedule in order to check whether new rows have been added to the table. Could that be the issue?

My logstash conf looks like this.

Where am I going wrong? Or is this behavior common?

Any help would be appreciated.


(Christian Dahlqvist) #2

If you are assigning document IDs based on the data in the database and therefore end up updating documents in Elasticsearch, this will result in the old version of the document being listed as deleted.
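To make the mechanism concrete, here is a toy Python model of it. This is only a sketch under stated assumptions, not how Elasticsearch is actually implemented: `ToyIndex` and its methods are hypothetical names, and the five rows stand in for the table. It shows that re-indexing a row under the same ID leaves the live count unchanged and only bumps the deleted counter until a merge cleans it up.

```python
class ToyIndex:
    """Toy model of an append-only index; NOT Elasticsearch's real implementation."""

    def __init__(self):
        self.live = {}    # document_id -> current version of the document
        self.deleted = 0  # superseded versions that have not been merged away yet

    def index(self, doc_id, source):
        # Re-indexing an existing ID marks the old version as deleted
        # and writes a new one; the number of live documents does not change.
        if doc_id in self.live:
            self.deleted += 1
        self.live[doc_id] = source

    def merge(self):
        # A segment merge physically drops the versions marked as deleted.
        self.deleted = 0

    def stats(self):
        return {"count": len(self.live), "deleted": self.deleted}


idx = ToyIndex()
for row_id in range(1, 6):   # first scheduled run reads 5 rows
    idx.index(row_id, {"v": 1})
for row_id in range(1, 6):   # second run re-reads the same 5 rows
    idx.index(row_id, {"v": 2})
print(idx.stats())           # {'count': 5, 'deleted': 5}
```

After `idx.merge()` the deleted counter drops back to 0 while `count` stays at 5, which is why the deleted figure tends to rise and fall over time.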


(Kulasangar Gowrisangar) #3

@Christian_Dahlqvist

Yes, the document_id I've set is the primary key (id) of the table.

So will that cause any data loss in my total docs? Or can it be ignored?


(Christian Dahlqvist) #4

It is how updates are reported in the stats, and is not an indication of data loss.
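If you want to watch the two counters yourself, they are reported per index by the index stats API (`GET /<index>/_stats/docs`) as `docs.count` and `docs.deleted`. Below is a minimal Python sketch for reading them. The index name `my_index` and the helper names are assumptions for illustration, and the deleted figure in the sample is invented; only 160729 comes from this thread.

```python
import json
import urllib.request


def fetch_doc_stats(base_url, index):
    """Call GET /<index>/_stats/docs and return the parsed JSON body."""
    with urllib.request.urlopen(f"{base_url}/{index}/_stats/docs") as resp:
        return json.load(resp)


def doc_counts(stats, index):
    """Extract live and deleted document counts for the primaries of one index."""
    docs = stats["indices"][index]["primaries"]["docs"]
    return docs["count"], docs["deleted"]


# Illustrative response fragment: 160729 is the total from the question,
# the deleted figure is made up for the example.
sample = {
    "indices": {
        "my_index": {"primaries": {"docs": {"count": 160729, "deleted": 1200}}}
    }
}
count, deleted = doc_counts(sample, "my_index")
print(count, deleted)
```

Against a live node this would be something like `doc_counts(fetch_doc_stats("http://localhost:9200", "my_index"), "my_index")`, assuming Elasticsearch listens on that address. As long as `count` matches the number of rows in the table, a non-zero `deleted` value is nothing to worry about.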


(Kulasangar Gowrisangar) #5

@Christian_Dahlqvist Oh that sounds great. :slight_smile:

So I don't have to worry about this? Am I doing it the right way?

Thanks.


(Christian Dahlqvist) #6

If records in the database are getting updated, it looks like you are catching this. Whether you are catching this correctly or not I can't tell as that depends on your data model and how you are extracting changes.


(Mark Walkom) #7

Have a look at https://www.elastic.co/guide/en/elasticsearch/guide/2.x/dynamic-indices.html#deletes-and-updates as well.

