High rejections - bulk API

My Elasticsearch cluster: version 7.2.0.
3 data nodes, 1 master node, 2 master+ingress nodes

Problem: I am seeing a very high number of rejections on the data nodes, which results in data loss.
GET _nodes/moss-eck-es-data-2/stats/thread_pool?human&pretty

"write" : {
  "threads" : 4,
  "queue" : 0,
  "active" : 0,
  "rejected" : 1361456,
  "largest" : 4,
  "completed" : 3887827
}

The cluster currently has around 720 shards.

My use case: we are indexing documents that have nearly 200 fields each, and we frequently update the same records within a bulk request. A single bulk request may contain updates for multiple indices.
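For illustration, a single bulk request roughly has this shape (the index names, IDs, fields and values below are made up, and real documents have far more fields):

POST _bulk
{ "index" : { "_index" : "metrics-cpu", "_id" : "host-1" } }
{ "host" : "host-1", "cpu" : 42.5, "timestamp" : "2021-06-01T10:00:00Z" }
{ "index" : { "_index" : "metrics-mem", "_id" : "host-1" } }
{ "host" : "host-1", "mem_used_pct" : 73.1, "timestamp" : "2021-06-01T10:00:00Z" }
{ "index" : { "_index" : "metrics-cpu", "_id" : "host-1" } }
{ "host" : "host-1", "cpu" : 44.0, "timestamp" : "2021-06-01T10:15:00Z" }

Note the same _id appearing twice for metrics-cpu - that is the update scenario mentioned above.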

Can anyone please suggest what might be going wrong and what might help?

Thanks in advance

Hey,

This means that those bulk requests could not be processed because the write thread pool and its queue were full at the time of the index request. This information is returned as part of the bulk API response, so the client can decide what the next step should be: discarding those documents, or waiting and retrying them.
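To illustrate (the values below are made up and the reason string is abridged), a rejected item in the bulk response carries a 429 status and an es_rejected_execution_exception, so a client can pick out exactly those items and resubmit them later:

{
  "took" : 12,
  "errors" : true,
  "items" : [
    {
      "index" : {
        "_index" : "metrics-cpu",
        "_id" : "host-1",
        "status" : 429,
        "error" : {
          "type" : "es_rejected_execution_exception",
          "reason" : "rejected execution of ... on EsThreadPoolExecutor[name = .../write, ...]"
        }
      }
    }
  ]
}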

Judging by the number of threads, I assume you have four cores. To get faster writes you could either add more nodes or increase the number of cores per node - although it is of course possible that you then hit another bottleneck, such as I/O.

Maybe you can tell us a little more about your cluster. Is it only this node that has rejections, or is your whole cluster under load or overloaded?
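One quick way to check that (just a suggestion, any equivalent stats call works) is the cat thread pool API, which lists the write pool for every node at once:

GET _cat/thread_pool/write?v&h=node_name,name,active,queue,rejected,completed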

Also, what is a master+ingress node?

All the data nodes have rejections.

In our cluster we have a total of 6 nodes:

3 acting as dedicated data nodes
2 acting in both master and ingress roles
1 acting as a dedicated master node

Each node's resource configuration:
Java heap: -Xms12g -Xmx12g
RAM: 24 GB
CPU: 4 cores
Disk: 400 GB for data nodes, 150 GB for the other nodes

We are running the cluster on Kubernetes.

So, how big are your bulk requests and how many are you sending per second?

The number of shards you are actively indexing into affects how quickly the queues fill up, and you seem to have quite a large number of shards given your data volume. Are you able to reduce the number of shards?
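As a rough sketch of what that could look like (the template name and index pattern are hypothetical), new indices can be created with a single primary shard via an index template:

PUT _template/metrics_single_shard
{
  "index_patterns" : ["metrics-*"],
  "settings" : {
    "number_of_shards" : 1,
    "number_of_replicas" : 1
  }
}

Existing indices keep their shard count; only newly created indices pick up the template.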

We push metrics data every 15 minutes. At the moment we flush roughly 10 to 50 bulk requests, varying in size between 2 MB and 64 MB. A single bulk request contains documents (of nearly 200 fields each) to be indexed into 80 indices (2 shards and 1 replica each), and a bulk request can also contain documents with the same ID multiple times (the update scenario).

OK, I will try reducing the shards.

One more question: performance-wise, would it help in my use case if we added 2 more data nodes?

Are you sending all those bulk requests in parallel, or are you waiting for some to finish first? Reducing the number of shards with that number of nodes sounds like a good idea to me as well.

We are sending four requests in parallel (flush thread count: 4). We wait at most 20 s for a response, and most of the time our bulk requests time out on the client side.

Frequent updates of the same document can have a very negative impact on indexing performance, as it can result in a large number of small flushes. Try to avoid this at all costs.
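As a self-contained sketch (index names, IDs and fields are made up): if a bulk payload ends up with several updates for the same document, only the latest state needs to be sent, i.e. one action per document ID per request:

POST _bulk
{ "index" : { "_index" : "metrics-cpu", "_id" : "host-1" } }
{ "host" : "host-1", "cpu" : 44.0, "timestamp" : "2021-06-01T10:15:00Z" }
{ "index" : { "_index" : "metrics-mem", "_id" : "host-1" } }
{ "host" : "host-1", "mem_used_pct" : 73.1, "timestamp" : "2021-06-01T10:15:00Z" }

Collapsing the duplicates (keeping only the last version of each document ID before flushing) has to happen on the client side, because Elasticsearch executes every action in the bulk body.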
