Transaction Log when making new replica

haldrich · December 23, 2015, 4:20pm

Our system requires that we reindex potentially 1m+ records several times per day.
To improve indexing time, we set the replica count to 0 while indexing. After the indexing is done, we call a flush and then set the replica count to 1.
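For readers following along, the workflow described above can be sketched as an ordered list of REST calls. This is only a minimal sketch, not the poster's actual code: the function name `replica_toggle_plan` and the index name `records_v2` are hypothetical, and the sketch only builds the requests (update index settings, flush, update index settings) rather than sending them.

```python
import json


def replica_toggle_plan(index, final_replicas=1):
    """Return the (method, path, body) calls for the described workflow.

    Hypothetical helper for illustration; it does not contact a cluster.
    """
    settings_path = f"/{index}/_settings"
    return [
        # 1. Drop replicas before the bulk indexing run.
        ("PUT", settings_path, {"index": {"number_of_replicas": 0}}),
        # (bulk indexing happens here)
        # 2. Flush once indexing is finished.
        ("POST", f"/{index}/_flush", None),
        # 3. Add the replica back.
        ("PUT", settings_path, {"index": {"number_of_replicas": final_replicas}}),
    ]


if __name__ == "__main__":
    # "records_v2" is an assumed index name.
    for method, path, body in replica_toggle_plan("records_v2"):
        print(method, path, json.dumps(body) if body is not None else "")
```

Each tuple could be sent with any HTTP client; keeping the plan as data makes the order of the steps easy to check.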

When the new replica is created, Marvel shows the index in stage "TRANSLOG" while it recovers from one node to another. This lasts for some time, often over an hour, leaving the cluster yellow during the process.

I'm confused as to why the replica needs to play back the transaction log to be generated. The index hasn't changed when the replica is made (it hasn't even been made an active index in our production system at the time the replica is added).

Is this the normal process for creating a replica? If so, is there some more efficient process that can be used?

Our cluster is basic: 4 proc VMs w/ 12GB RAM (6GB allocated to the JVM) and SAN storage (~100MB/s throughput, so it isn't as fast as SSDs or local 15k storage). However, we can't do much about the setup we have.

Thanks!
Heath

warkolm · December 23, 2015, 8:55pm

That's not the reason why it's yellow; it's yellow because there are replica shards in a non-STARTED state.
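As a rough illustration of that rule, here is a simplified model of how health color follows from shard states. It is a sketch under assumptions, not Elasticsearch's implementation: the function `cluster_color` and the `(is_primary, state)` tuple layout are hypothetical, and STARTED and RELOCATING are both treated as active.

```python
# States in which a shard copy is considered active in this simplified model.
ACTIVE_STATES = {"STARTED", "RELOCATING"}


def cluster_color(shards):
    """Derive a health color from an iterable of (is_primary, state) tuples.

    red    -> at least one primary is not active
    yellow -> all primaries active, but at least one replica is not
    green  -> every shard copy is active
    """
    color = "green"
    for is_primary, state in shards:
        if state in ACTIVE_STATES:
            continue
        if is_primary:
            return "red"
        color = "yellow"
    return color


if __name__ == "__main__":
    # One primary is started; its new replica is still recovering.
    print(cluster_color([(True, "STARTED"), (False, "INITIALIZING")]))
```

A replica that is still replaying the translog is in an initializing state, so under this model the cluster stays yellow until that recovery finishes.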

When we index a document, it gets sent to the primary shard; the entire document is then sent to the replica shard and indexed there from scratch. We do not simply send the indexed outcome from the primary to the replica.
When you add a new replica, we create that using the translog, as it holds the complete action that was done on the document.

haldrich · December 25, 2015, 12:25am

Hi Mark

Thanks for the info.

So to confirm, what you're saying is that this is a normal process to be expected when adding a replica to an existing index, and there isn't any way to make it more efficient?

It is confusing that we have seen a better result by simply indexing with a refresh interval of -1 but a replica count of 1.

This adds about 10% to the overall indexing time but avoids the translog state, and the cluster goes green more quickly, completing the replication.
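The alternative described above (keep one replica, disable refresh during the bulk load) could be expressed as two index settings bodies. This is a hedged sketch: the helper names are hypothetical, and the restore value of `1s` is an assumption (it is the Elasticsearch default), since the thread does not say what the interval was set back to.

```python
import json


def bulk_load_settings():
    """Settings body for the bulk load: refresh disabled, replica kept."""
    return {"index": {"refresh_interval": "-1", "number_of_replicas": 1}}


def post_load_settings(refresh_interval="1s"):
    """Settings body to re-enable refresh afterwards.

    The "1s" default here is an assumption, not a value from the thread.
    """
    return {"index": {"refresh_interval": refresh_interval}}


if __name__ == "__main__":
    # Each body would be sent with PUT /<index>/_settings.
    print(json.dumps(bulk_load_settings()))
    print(json.dumps(post_load_settings()))
```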

I don't understand why the translog stage of adding a replica takes longer than the initial indexing.
We noted about 1 million records indexed in 18 minutes (bulk), but the comparable translog process takes almost 2 hours.
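To put those two figures side by side, the throughput can be worked out directly. This sketch assumes exactly 1,000,000 records and rounds "almost 2 hours" up to 120 minutes, so the slowdown it reports is an approximate upper bound; the helper name is hypothetical.

```python
def docs_per_second(doc_count, minutes):
    """Average throughput in documents per second."""
    return doc_count / (minutes * 60)


DOCS = 1_000_000          # "about 1 million records" (assumed exact here)
BULK_MINUTES = 18         # bulk indexing time from the post
TRANSLOG_MINUTES = 120    # "almost 2 hours", rounded up (assumption)

bulk_rate = docs_per_second(DOCS, BULK_MINUTES)
translog_rate = docs_per_second(DOCS, TRANSLOG_MINUTES)
slowdown = bulk_rate / translog_rate

print(f"bulk indexing:   {bulk_rate:.0f} docs/s")
print(f"translog replay: {translog_rate:.0f} docs/s")
print(f"slowdown:        {slowdown:.1f}x")
```

Under those assumptions the bulk load runs at roughly 926 docs/s against roughly 139 docs/s for the translog stage, a gap of about 6.7x.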

Thanks again for the clarifications.
Heath

nik9000 · December 25, 2015, 2:00am

Unless you are still writing to the index after you make the replica I wouldn't expect much translog.

So it sounds like a bug. I'd file it on GitHub.
