Insert and Update records in Elasticsearch

Hi,

I am looping through each record, inserting the new records into Elasticsearch and updating the existing ones. For example, I have a CSV file with 5000 records, but only a few of them are updated in the index and the rest are missed. My insert-and-update script runs every 5 minutes. Is there a way to insert all the records without missing any?

How do you know some are missing? Did you refresh the index before searching? How are you inserting the data?

I am inserting the data from a dataframe using Python.
After the insertion, the number of documents in the index is not the same as the number of rows in the CSV file.
No, I don't refresh the index.

Three things come to mind:

  • You are using the same _id for several documents, so those documents are updated (overwritten) instead of being added as new ones.
  • There are errors for some documents, but you are not checking the responses for them.
  • You are calling _search immediately after the last index operation and, because you are not refreshing the index manually, the last batch of documents is not searchable yet.
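
To tell the first two cases apart, it can help to look at what each bulk request returned. Below is a minimal sketch in plain Python, assuming you send the data through the _bulk API and have the JSON response body as a dict. The helper name summarize_bulk_response and the sample response are made up for illustration; only the general shape (an "errors" flag and an "items" list with one entry per action) follows what _bulk returns.

```python
from collections import Counter


def summarize_bulk_response(response):
    """Count per-item outcomes in a _bulk response body (hypothetical helper)."""
    results = Counter()
    failures = []
    for item in response.get("items", []):
        # Each item holds a single key naming the action (index, create, update, delete).
        _action, detail = next(iter(item.items()))
        if "error" in detail:
            failures.append((detail.get("_id"), detail.get("status"), detail["error"]))
        else:
            # "result" is typically "created" for new documents and "updated" for overwrites.
            results[detail.get("result", "unknown")] += 1
    return results, failures


# Made-up response for illustration, not real output from a cluster.
sample_response = {
    "took": 30,
    "errors": True,
    "items": [
        {"index": {"_id": "1", "status": 201, "result": "created"}},
        {"index": {"_id": "1", "status": 200, "result": "updated"}},
        {"index": {"_id": "2", "status": 400,
                   "error": {"type": "mapper_parsing_exception",
                             "reason": "failed to parse field [price]"}}},
    ],
}

results, failures = summarize_bulk_response(sample_response)
print(dict(results))
for doc_id, status, error in failures:
    print(doc_id, status, error["type"], error["reason"])
```

If "updated" is higher than you expect, some rows are probably sharing an _id. If the failures list is not empty, those documents were rejected and never reached the index.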

Thanks for the response. I have added the refresh part in my script.

Hi,

I added the refresh. But what I found is that records are missed during the insert only at a particular time of the day.

The index has between 50,000 and 2 lakh (200,000) records, and each run inserts a minimum of 2,000 records.

Then the other two options I mentioned are still valid.
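
For the duplicate _id case, a quick check on the source file before indexing can confirm or rule it out. This is only a sketch, assuming the _id is taken from a single CSV column; the column name "id", the helper find_duplicate_ids and the inline sample data are placeholders, so swap in your own file and field.

```python
import csv
import io
from collections import Counter


def find_duplicate_ids(rows, id_field):
    """Return {id: count} for every id that appears more than once (hypothetical helper)."""
    counts = Counter(row[id_field] for row in rows)
    return {doc_id: n for doc_id, n in counts.items() if n > 1}


# Inline sample standing in for the real CSV file; the "id" column is an assumption.
sample_csv = io.StringIO("id,name\n1,a\n2,b\n1,c\n")
rows = list(csv.DictReader(sample_csv))

duplicates = find_duplicate_ids(rows, "id")
unique_ids = len({row["id"] for row in rows})
print(f"{len(rows)} rows, {unique_ids} unique ids, duplicates: {duplicates}")
```

If the number of unique ids is lower than the number of rows, the index can never hold as many documents as the file has rows, because later rows overwrite earlier ones with the same _id.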