Bulk Request not fully indexing data in synchronised request

vishaliiita · December 7, 2016, 1:12pm

Hi,

I have been trying to index a large amount of data, about a million documents, using Elasticsearch 2.3.3.

I tried two approaches:

1) Asynchronous. This overloaded ES: the client fired bulk requests of 100 documents each so rapidly that ES could not keep up at all. It indexed only the first 9-10K documents, and after a few seconds it started rejecting every request with IndexRequestRejected. My system also became very slow, and the process ate up all the memory.

Then I changed the bulk queue size to 1000 in elasticsearch.yml, with no luck. I think the queue also filled up because the requests were fired so rapidly.

Then I added a Thread.sleep(500) (half a second, which is a lot slower). All the data was indexed except for 10 documents, which failed because the string size of one of the fields was > 32766 bytes in UTF-8. I took that into account and I am OK with it.

But putting a sleep in your API is never good practice, so I dropped that idea.
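
One sleep-free way to throttle the client is to cap how many bulk requests are in flight, so the sender blocks when ES falls behind instead of flooding its queue. The following is a minimal sketch of that idea using only the JDK, not the code from the application described here: `sendBulk` is a hypothetical stand-in for the real client call, and the class and parameter names are made up for illustration. The Java client's BulkProcessor, with its concurrent-requests setting, appears to serve a similar purpose and may be the simpler choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Caps the number of unfinished bulk requests so the sender blocks instead of flooding the server. */
final class BoundedBulkSender {

    /**
     * Splits docs into batches and sends them asynchronously, never allowing more
     * than maxInFlight unfinished batches at once. sendBulk is a hypothetical
     * stand-in for the real client call; it returns how many documents were accepted.
     */
    static int sendAll(List<String> docs, int batchSize, int maxInFlight,
                       Function<List<String>, Integer> sendBulk) {
        Semaphore permits = new Semaphore(maxInFlight);
        AtomicInteger accepted = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(maxInFlight);
        List<CompletableFuture<Void>> pending = new ArrayList<>();
        try {
            for (int start = 0; start < docs.size(); start += batchSize) {
                List<String> batch = docs.subList(start, Math.min(start + batchSize, docs.size()));
                // Blocks here while maxInFlight batches are still running.
                permits.acquireUninterruptibly();
                pending.add(CompletableFuture
                        .supplyAsync(() -> sendBulk.apply(batch), pool)
                        .thenAccept(accepted::addAndGet)
                        // Release the permit whether the batch succeeded or threw.
                        .whenComplete((ok, err) -> permits.release()));
            }
            // Wait for the tail of the work; rethrows if any batch failed.
            CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
        } finally {
            pool.shutdown();
        }
        return accepted.get();
    }
}
```

With maxInFlight set to 1 this behaves like the synchronous approach; larger values allow some overlap without letting requests pile up without bound.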

2) Synchronous. This started out firing requests rapidly, but then slowed down to wait for ES to index the documents, and stayed that way until the last request.

But then I noticed that only about half of the total documents were indexed. I don't know what the issue is here.

Nothing was printed in the error logs, and none of the requests failed except for a few, because the string size of one of the fields was > 32766 bytes in UTF-8.
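
A bulk request can succeed as a whole while individual items inside it fail or silently overwrite an existing document with the same id, and neither case necessarily shows up in the server log. A minimal sketch of tallying per-item outcomes follows. It assumes the bulk response has already been flattened into the hypothetical `ItemResult` class below; in the 2.x Java client the corresponding information should be available from each BulkItemResponse (for example isFailed() and getFailureMessage()), but the mapping is left out here.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical flattened view of one item in a bulk response. */
final class ItemResult {
    final String id;
    final boolean created;  // true if a new document was created, false if an existing id was overwritten
    final String failure;   // null when the item succeeded

    ItemResult(String id, boolean created, String failure) {
        this.id = id;
        this.created = created;
        this.failure = failure;
    }
}

final class BulkTally {

    /** Counts items under "created", "overwritten", or "failed: <reason>". */
    static Map<String, Integer> tally(List<ItemResult> items) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (ItemResult item : items) {
            String key;
            if (item.failure != null) {
                key = "failed: " + item.failure;
            } else if (item.created) {
                key = "created";
            } else {
                key = "overwritten";
            }
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }
}
```

If the "overwritten" count is large, the source data probably contains repeated ids, which would explain a final document count well below the number of documents sent.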

I read a lot about this online and then tried a few things.

For example:

set replicas to 0,
increased the number of shards to 20 (I don't know whether the shard count has anything to do with data loss; I think it only matters for parallel indexing and should not affect data consistency),
set the refresh interval to -1, and then refreshed after indexing completed.

But it STILL only indexes half of the data.

I am totally confused now, and I have been thinking about what to do for the past week.

PLEASE HELP!!!

system · January 4, 2017, 1:12pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
