Shard check on startup is slow

vpunski · November 3, 2011, 9:01am

I tried to use index.shard.check_on_startup to check the consistency of
shards.
For some unknown reason, this process is very slow.
From a brief look at system resources, CPU usage is very low, and the
process doesn't seem to be I/O intensive either.
My disks read at about 20MB/sec during the check, whereas
"hdparm -t /dev/sda" reports about 100MB/sec.

What is the reason for such slow loading?

Current shard configuration:
{
  "cluster_name" : "MY_CLUSTER",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 10,
  "number_of_data_nodes" : 10,
  "active_primary_shards" : 10,
  "active_shards" : 30,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0
}
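
The read-rate comparison in the post above (about 20MB/sec during the check versus about 100MB/sec from hdparm) could be reproduced on a single file with a rough sketch like the one below. The function name `measure_read_throughput` and the temporary demo file are hypothetical, introduced only for illustration. The numbers it prints are also affected by the OS page cache, so treat it as a quick sanity check rather than a benchmark.

```python
import os
import tempfile
import time


def measure_read_throughput(path, chunk_size=1024 * 1024):
    """Read `path` sequentially; return (bytes_read, megabytes_per_second)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero interval on very small files.
    rate = total / (1024 * 1024) / elapsed if elapsed > 0 else float("inf")
    return total, rate


if __name__ == "__main__":
    # Hypothetical demo file; point the function at a real index file
    # to compare against the rate reported by hdparm.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(8 * 1024 * 1024))
    try:
        size, mb_per_sec = measure_read_throughput(tmp.name)
        print(f"read {size} bytes at {mb_per_sec:.1f} MB/sec")
    finally:
        os.unlink(tmp.name)
```

A sequential read like this is the best case for a spinning disk. If the check on startup reads noticeably slower than this on the same files, that would point toward its access pattern or toward several shards sharing the disk.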

kimchy · November 3, 2011, 5:15pm

I mentioned before that this is a slow process. It needs to check and read
quite a lot of data from the index.

vpunski · November 4, 2011, 8:20am

Yes, you've mentioned it before. Now I'd like to understand the actual
reason for the slow read rate (from an I/O perspective), not just that it
is a "slow process". By "reason" I mean, for example, non-sequential file
reads required by the Lucene index check process. In any case, I'd like to
understand the real reason and try to solve it.

From what I've seen in the code, all the shards are started in parallel, so
their "check on startup" processes compete for the same hardware storage
(disk).

Am I missing something?

Thanks
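
The concern above, that many shards checking at once compete for one disk, comes down to how many checks run at the same time. The sketch below only illustrates the general idea of capping parallel work with a fixed-size thread pool. It is not how Elasticsearch or Lucene implement it, and `run_checks` and `fake_check` are hypothetical names.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_checks(shard_ids, max_parallel, check):
    """Call check(shard_id) for every shard, at most max_parallel at once."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # pool.map returns results in the order of shard_ids.
        return list(pool.map(check, shard_ids))


if __name__ == "__main__":
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def fake_check(shard_id):
        # Stand-in for a disk-heavy check; records peak concurrency.
        with lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        time.sleep(0.01)
        with lock:
            state["active"] -= 1
        return shard_id

    run_checks(range(10), max_parallel=2, check=fake_check)
    print("peak concurrent checks:", state["peak"])
```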

kimchy · November 8, 2011, 3:01am

You can have a look at the CheckIndex class in Lucene to see how it's
implemented. Regarding the number of shards started in parallel on a node,
you can control that. By default, a node allows 4 primary shards to be
started in parallel
(cluster.routing.allocation.node_initial_primaries_recoveries setting)
and 2 replica shards
(cluster.routing.allocation.node_concurrent_recoveries setting).
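
As a rough sketch, the two settings named above can be collected into a settings map like this. The setting names and the defaults of 4 and 2 come from the reply. The helper `build_recovery_settings` is hypothetical, and how the map is applied (for example in the node configuration file or through a cluster settings update) depends on the Elasticsearch version and is not shown here.

```python
import json


def build_recovery_settings(initial_primaries=4, concurrent_recoveries=2):
    """Return a settings map for per-node shard recovery concurrency."""
    if initial_primaries < 1 or concurrent_recoveries < 1:
        raise ValueError("recovery limits must be at least 1")
    return {
        # Primary shards started in parallel on a node.
        "cluster.routing.allocation.node_initial_primaries_recoveries":
            initial_primaries,
        # Replica shards recovered in parallel on a node.
        "cluster.routing.allocation.node_concurrent_recoveries":
            concurrent_recoveries,
    }


if __name__ == "__main__":
    # Lower the primary limit so fewer checks share the disk at once.
    print(json.dumps(build_recovery_settings(initial_primaries=1), indent=2))
```

Lowering the primary limit trades a longer overall startup for less competition on the disk while each shard is being checked.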
