Replica segments not merged upon bulk indexing

colinsurprenant · January 23, 2013, 4:31pm

Hi,

We just created a 2-node benchmarking cluster (ES 0.20.1, Ubuntu 10.04,
kernel 2.6.32-41) using a single index with 1 shard and 1 replica.

We started bulk indexing, and at about 75M documents the second node, which
holds the replica, failed with filesystem inode exhaustion. There were
about 7700 segments on the replica node, while there were only about
60 segments on the node holding the primary shard.
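
For anyone who wants to compare the counts on their own cluster, here is a rough sketch of how the per-copy segment counts could be read from the indices segments API (`/{index}/_segments`). The response shape it expects (`indices` → index name → `shards` → shard id → list of copies, each with `routing` and `segments`) is an assumption and may differ between versions; the host and the index name `bench` are placeholders, not our actual setup.

```python
import json
from urllib.request import urlopen


def segment_counts(segments_response):
    """Count segments for every shard copy in a parsed _segments response.

    Returns {(index, shard_id, role, node_id): segment_count}, where role is
    "primary" or "replica". The response layout is assumed, not guaranteed.
    """
    counts = {}
    for index_name, index_data in segments_response.get("indices", {}).items():
        for shard_id, copies in index_data.get("shards", {}).items():
            for copy in copies:
                routing = copy.get("routing", {})
                role = "primary" if routing.get("primary") else "replica"
                key = (index_name, shard_id, role, routing.get("node"))
                counts[key] = len(copy.get("segments", {}))
    return counts


def fetch_segment_counts(base_url="http://localhost:9200", index="bench"):
    """Fetch and summarize segment counts (placeholder URL and index name)."""
    with urlopen(f"{base_url}/{index}/_segments") as resp:
        return segment_counts(json.load(resp))
```

Calling `fetch_segment_counts()` against a running node returns one entry per shard copy, so a large gap between the primary and the replica shows up directly.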

All settings were left pretty much at their default values (in particular
the merge parameters), except for refresh_interval, which was set to -1.

The question is: how did the replica node end up with so many segments?
It looks like it did not respect the index merge policy. I know that
the performance "best practice" for bulk indexing is to use no replicas
and add them after the bulk load. But regardless of this, isn't this huge
difference in segment count between the primary and the replica a problem?

Thanks,
Colin
