Elasticsearch does not index all documents

AlexisC · May 28, 2018, 2:13pm

Hello,

I tried to index a CSV file of 10 000 000 lines into Elasticsearch via the Bulk API. That succeeded and it works.
But the CSV file contains several duplicates, so I decided to set the _id of each document myself. When I index the CSV file now, I get fewer than 10 000 000 documents, which is logical.
What is less logical is that a first indexing run gives me 9 989 339 documents, a second run 9 278 194, a third run 9 584 239 ... I never get the same number. Why is it not working correctly? What's wrong with my script?
Thanks

Christian_Dahlqvist · May 28, 2018, 2:18pm

Have you run a refresh after completing indexing? Did you see any errors in the responses for the bulk requests?
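
One way to check the bulk responses for errors is to tally the per-item outcomes instead of only looking at the HTTP status, since a bulk request can return 200 while individual items fail. The sketch below assumes the response body has already been parsed from JSON into a dict. The helper name `summarize_bulk_response` and the sample data are hypothetical and only illustrate the shape of a Bulk API response.

```python
from collections import Counter


def summarize_bulk_response(response):
    """Tally per-item outcomes of one parsed Bulk API response (hypothetical helper)."""
    results = Counter()
    failures = []
    for item in response.get("items", []):
        # Each item has a single key naming the action: index, create, update or delete.
        action, detail = next(iter(item.items()))
        if "error" in detail:
            results["failed"] += 1
            failures.append((detail.get("_id"), detail["error"]))
        else:
            # "result" is "created" for a new _id and "updated" when the _id already existed.
            results[detail.get("result", "unknown")] += 1
    return results, failures


# Made-up sample response, for illustration only.
sample = {
    "errors": True,
    "items": [
        {"index": {"_id": "a", "status": 201, "result": "created"}},
        {"index": {"_id": "a", "status": 200, "result": "updated"}},
        {"index": {"_id": "b", "status": 429,
                   "error": {"type": "es_rejected_execution_exception"}}},
    ],
}

counts, failed = summarize_bulk_response(sample)
print(dict(counts))
print(failed)
```

Summed over all bulk requests of a run, the "created" count should match the number of documents in the index after a refresh, and "updated" should match the number of duplicate rows. Any "failed" items are documents that never made it into the index.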

AlexisC · May 28, 2018, 4:06pm

Yes, I did a refresh after completing indexing and I have no errors in the responses from the bulk requests.
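
Since the _id is set explicitly, the expected document count is the number of distinct _id values in the CSV file, and that number should be identical on every run. A minimal sketch for computing it follows. It assumes the file has a header row and that the _id comes from a single column, and the column name `id` and the inline sample data are hypothetical.

```python
import csv
import io


def count_distinct_ids(lines, id_column):
    """Return (total rows, distinct ids) for CSV input that has a header row."""
    reader = csv.DictReader(lines)
    seen = set()
    total = 0
    for row in reader:
        total += 1
        seen.add(row[id_column])
    return total, len(seen)


# Made-up sample data; for a real file, pass open("file.csv", newline="") instead.
sample = io.StringIO("id,name\n1,a\n2,b\n1,c\n")
total, distinct = count_distinct_ids(sample, "id")
print(total, distinct)  # prints "3 2"
```

If the count in the index after a refresh is lower than the distinct-id count, some documents were not indexed. If it differs from run to run, the variation comes from the indexing script rather than from the duplicates.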

dadoonet · May 29, 2018, 4:09am

What is your script?

