Load unfairly distributed during large ingest

Hey friends, new guy here. :baby_bottle::innocent:

I have an ES cluster running on cheap spot instances: 1 master, 5 data nodes, 10 shards, and 1 replica per shard. I have a round-robin load balancer in front of my cluster.

During ingest, the load is heavily concentrated on two (small) nodes. I can't figure out why this is the case, as I have shards assigned to every node. Can anyone help me troubleshoot this?
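For reference, here's how I checked that shards are assigned to every node. This is a sketch: the node names below are invented, and in a real cluster the list would come from the `_cat/shards` API (assuming it's reachable on `localhost:9200`):

```shell
# Sample _cat/shards output (node names invented); on a real cluster:
#   curl -s 'localhost:9200/_cat/shards?h=index,shard,prirep,state,node'
cat <<'EOF' > shards.txt
myindex 0 p STARTED data-1
myindex 0 r STARTED data-2
myindex 1 p STARTED data-3
myindex 1 r STARTED data-4
myindex 2 p STARTED data-5
myindex 2 r STARTED data-1
EOF
# Count shards per node -- a skewed count here would mean a skewed ingest load
awk '{print $5}' shards.txt | sort | uniq -c | sort -rn
```

In my case the counts looked roughly even, which is why the concentrated load was so confusing.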

Hard to tell from the picture. Are you just importing, or do you have an ingest processor or something? Those nodes look like they have more disk usage; I wonder if you have a hot spot created by something like parent/child.

Hey Nik, thanks for the reply.

I am just running an import, using a small Spark cluster to shove data into my ES cluster.

The two servers with high load do have less disk space. They were also the first data nodes to join the cluster, although I can't imagine why that matters. I don't have any custom routing.

I'm more of a data scientist than an infra guy, so I'm quite stumped!

In situations like this I tend to use the hot_threads API to have a look at the guts and see if there's a smoking gun. If you post a gist of the output, I can have a look.

Nik!

Thanks for pointing me in the right direction. I looked into hot_threads, went down a few other rabbit holes, and discovered that two of my shards were stuck "relocating." After I solved that issue, the cluster was able to ingest at 35k documents/sec, with a uniform load distribution.
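For anyone hitting the same thing: stuck shards show up with a non-`STARTED` state in `_cat/shards`. A sketch of spotting them (the sample data is invented; the real list would come from `curl -s 'localhost:9200/_cat/shards?h=index,shard,prirep,state,node'`):

```shell
# Invented sample of _cat/shards output; on a real cluster:
#   curl -s 'localhost:9200/_cat/shards?h=index,shard,prirep,state,node'
cat <<'EOF' > shards.txt
myindex 0 p STARTED    data-1
myindex 1 p RELOCATING data-2 -> 10.0.0.7 data-3
myindex 2 p RELOCATING data-4 -> 10.0.0.8 data-5
EOF
# Flag anything not STARTED -- these are the stuck shards
awk '$4 != "STARTED"' shards.txt
# One possible remedy for allocations that have failed too many times
# (not necessarily what was done in this thread):
#   curl -XPOST 'localhost:9200/_cluster/reroute?retry_failed=true'
```

Note that `retry_failed=true` is just one option; whether it applies depends on *why* the shards got stuck.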

Thanks for taking a moment to help the new guy.

Thanks for looking into hot_threads! Glad that solved your issue.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.