Why Are Bulk Insertions Not Being Round Robined?

thegrif · November 12, 2019, 8:14am

This question is cross-posted from Amazon's Elasticsearch Forum. (link) I'm unsure whether it's an Elasticsearch or an AWS issue - but would welcome any help/feedback this community can offer.

Bulk indexing requests pointed at an AWS ES domain's VPC endpoint are not being round-robined across the data nodes for processing. This is most obvious on the instance health dashboard: https://imgur.com/a/wdhlkhk

Domain settings:

5 r5.xlarge.elasticsearch nodes
1 AZ
General Purpose SSD (60GB per node)
Isolated in a private subnet

Full settings: https://imgur.com/a/vFOrueC

Background

I have a process that uses the Elasticsearch Bulk API to load data. I am pointing the process at the Elasticsearch VPC endpoint. I assumed that there was a load balancer between this endpoint and the data nodes, but the per-instance health metrics show that indexing operations are being handled by only one of the nodes.
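For context, a minimal sketch of the kind of request body the Bulk API expects: newline-delimited JSON, with an action line followed by a source line for each document and a trailing newline at the end. The index name `my-index`, the helper `build_bulk_body`, and the sample documents are hypothetical placeholders, not taken from my actual loader.

```python
import json


def build_bulk_body(index_name, docs):
    """Build an NDJSON body for the _bulk endpoint.

    `docs` is an iterable of (doc_id, source_dict) pairs. Both the helper
    and its arguments are illustrative assumptions.
    """
    lines = []
    for doc_id, source in docs:
        # Action line: tells Elasticsearch to index the next line as a document.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        # Source line: the document itself.
        lines.append(json.dumps(source))
    # The bulk body must end with a newline character.
    return "\n".join(lines) + "\n"


body = build_bulk_body(
    "my-index",  # hypothetical index name
    [("1", {"field": "a"}), ("2", {"field": "b"})],
)
print(body)
```

The resulting string would be sent as a `POST` to the domain's `_bulk` endpoint with the `Content-Type: application/x-ndjson` header.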

Additional settings I have configured to accelerate indexing:

replicas: 0
shards: 1
index refresh interval: -1
translog flush threshold size: 1024MB
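
One thing that may be relevant, although nobody in this thread confirms it as the cause: Elasticsearch routes every document to exactly one primary shard, and with `shards: 1` and `replicas: 0` that single shard lives on a single data node. The indexing work would then land on that node no matter which node receives the HTTP request. The sketch below illustrates the idea with a simplified routing formula (`hash(routing) % number_of_primary_shards`). The `crc32` hash is a stand-in I picked to keep the example dependency-free (Elasticsearch itself uses Murmur3), and the document IDs are made up.

```python
import zlib
from collections import Counter


def shard_for(doc_id: str, num_primary_shards: int) -> int:
    """Simplified routing: hash the routing value (the _id by default)
    and take it modulo the number of primary shards.

    crc32 is an illustrative stand-in for the Murmur3 hash that
    Elasticsearch actually uses.
    """
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards


def distribution(doc_ids, num_primary_shards):
    """Count how many documents would be routed to each shard."""
    return Counter(shard_for(d, num_primary_shards) for d in doc_ids)


# Hypothetical document IDs, for illustration only.
doc_ids = [f"doc-{i}" for i in range(10_000)]

one_shard = distribution(doc_ids, 1)    # every document maps to shard 0
five_shards = distribution(doc_ids, 5)  # documents spread over shards 0..4

print("1 primary shard: ", dict(one_shard))
print("5 primary shards:", dict(five_shards))
```

With one primary shard, every document maps to shard 0. Under this reading, spreading indexing load over five data nodes would need at least five primary shards, which has to be set when the index is created.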

Data Instance Health: https://imgur.com/a/wdhlkhk

Index Settings: https://gist.github.com/thegrif/64c3d5f1b4465cf13d3ddeaee4d2b9a2
Overall Stats: https://gist.github.com/thegrif/a43ff8fcfd83eca254d4f9a90dfb57d0
Node Info: https://gist.github.com/thegrif/fa55e72f3e995bbb8ba44f3a2dc51a57
Node Stats: https://gist.github.com/thegrif/acae53ffbfe7a6f154b175332779817f

dadoonet · November 12, 2019, 9:37am

I don't think it's related to Elasticsearch.

BTW did you look at https://www.elastic.co/cloud and https://aws.amazon.com/marketplace/pp/B01N6YCISK ?

Cloud by Elastic is one way to have access to all features, all managed by us. Think about what is already there, such as Security, Monitoring, Reporting, SQL, Canvas, APM, Logs UI, Infra UI, SIEM, Maps UI, and what is coming next ...

system · December 10, 2019, 9:37am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
