High rejections - bulk API

My Elasticsearch cluster: version 7.2.0.
3 data nodes, 1 master node, 2 master+ingress nodes

Problem: I am seeing a very high number of rejections on the data nodes, which results in data loss.
GET _nodes/moss-eck-es-data-2/stats/thread_pool?human&pretty

"write" : {
  "threads" : 4,
  "queue" : 0,
  "active" : 0,
  "rejected" : 1361456,
  "largest" : 4,
  "completed" : 3887827
}

The cluster currently has around 720 shards.

My use case: we are indexing documents that have nearly 200 fields each, and we frequently update the same records within a bulk request. A single bulk request may contain updates for multiple indices.
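For illustration, a single bulk request roughly has this shape (the index names, IDs, fields and values below are made up, and real documents have far more fields):

POST _bulk
{ "index" : { "_index" : "metrics-cpu", "_id" : "host-1" } }
{ "host" : "host-1", "cpu" : 42.5, "timestamp" : "2021-06-01T10:00:00Z" }
{ "index" : { "_index" : "metrics-mem", "_id" : "host-1" } }
{ "host" : "host-1", "mem_used_pct" : 73.1, "timestamp" : "2021-06-01T10:00:00Z" }
{ "index" : { "_index" : "metrics-cpu", "_id" : "host-1" } }
{ "host" : "host-1", "cpu" : 44.0, "timestamp" : "2021-06-01T10:15:00Z" }

Note the same _id appearing twice for metrics-cpu - that is the update scenario mentioned above.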

Can anyone please suggest what might be going wrong and what might help?

Thanks in advance

Hey,

This means that those bulk requests could not be processed because the write thread pool and its queue were full at the time of the index request. This information is returned as part of the bulk API response, so the client can decide what the next step should be: discarding those documents, or waiting and retrying them.
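To illustrate (the values below are made up and the reason string is abridged), a rejected item in the bulk response carries a 429 status and an es_rejected_execution_exception, so a client can pick out exactly those items and resubmit them later:

{
  "took" : 12,
  "errors" : true,
  "items" : [
    {
      "index" : {
        "_index" : "metrics-cpu",
        "_id" : "host-1",
        "status" : 429,
        "error" : {
          "type" : "es_rejected_execution_exception",
          "reason" : "rejected execution of ... on EsThreadPoolExecutor[name = .../write, ...]"
        }
      }
    }
  ]
}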

Judging by the number of threads, I assume you have four cores. To get faster writes you could either add more nodes or increase the number of cores per node - although it is of course possible that you then hit another bottleneck, such as I/O.

Maybe you can tell us a little more about your cluster. Is it only this node that has rejections, or is your whole cluster under load or overloaded?
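One quick way to check that (just a suggestion, any equivalent stats call works) is the cat thread pool API, which lists the write pool for every node at once:

GET _cat/thread_pool/write?v&h=node_name,name,active,queue,rejected,completed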

Also, what is a master+ingress node?

All the data nodes have rejections.

In our cluster we have a total of 6 nodes:

3 acting as dedicated data nodes
2 acting in both master and ingress roles
1 acting as a dedicated master node

Each node's resource configuration:
Java heap: -Xms12g -Xmx12g
RAM: 24 GB
CPU: 4 cores
Disk: 400 GB for data nodes, 150 GB for the other nodes

We are running the cluster on Kubernetes.

So, how big are your bulk requests and how many are you sending per second?

The number of shards you are actively indexing into affects how quickly the queues fill up, and you seem to have quite a large number of shards given your data volume. Are you able to reduce the number of shards?
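As a rough sketch of what that could look like (the template name and index pattern are hypothetical), new indices can be created with a single primary shard via an index template:

PUT _template/metrics_single_shard
{
  "index_patterns" : ["metrics-*"],
  "settings" : {
    "number_of_shards" : 1,
    "number_of_replicas" : 1
  }
}

Existing indices keep their shard count; only newly created indices pick up the template.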

We push metrics data every 15 minutes. At the moment we flush roughly 10 to 50 bulk requests, varying in size between 2 MB and 64 MB. A single bulk request contains documents (of nearly 200 fields each) to be indexed into 80 indices (2 shards and 1 replica each), and a bulk request can also contain documents with the same ID multiple times (the update scenario).

OK, I will try reducing the shards.

One more question: performance-wise, would it help in my use case if we added 2 more data nodes?

Are you sending all those bulk requests in parallel, or are you waiting for some to finish first? Reducing the number of shards with that number of nodes sounds like a good idea to me as well.

We are sending four requests in parallel (flush thread count: 4). We wait at most 20 s for a response, and most of the time our bulk requests time out on the client side.

Frequent updates of the same document can have a very negative impact on indexing performance, as it can result in a large number of small flushes. Try to avoid this at all costs.
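As a self-contained sketch (index names, IDs and fields are made up): if a bulk payload ends up with several updates for the same document, only the latest state needs to be sent, i.e. one action per document ID per request:

POST _bulk
{ "index" : { "_index" : "metrics-cpu", "_id" : "host-1" } }
{ "host" : "host-1", "cpu" : 44.0, "timestamp" : "2021-06-01T10:15:00Z" }
{ "index" : { "_index" : "metrics-mem", "_id" : "host-1" } }
{ "host" : "host-1", "mem_used_pct" : 73.1, "timestamp" : "2021-06-01T10:15:00Z" }

Collapsing the duplicates (keeping only the last version of each document ID before flushing) has to happen on the client side, because Elasticsearch executes every action in the bulk body.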
