Hi.
I have two Elasticsearch 7.7.1 clusters.
The leader cluster contains one index with 2 primary shards and 2 replicas of each (6 shards in total).
I indexed a few documents (the same document each time). They were all indexed into one primary shard, so the second primary is empty.
I configured and started replication. Using the /_all/ccr/stats API, I see two elements in the shards array. Example below:
{
  "indices":[
    {
      "index":"MY_INDEX1",
      "shards":[
        {
          "remote_cluster":"leader_cluster",
          "leader_index":"MY_INDEX1",
          "follower_index":"MY_INDEX1",
          "shard_id":0,
          "leader_global_checkpoint":3,
          "leader_max_seq_no":3,
          ............
        },
        {
          "remote_cluster":"leader_cluster",
          "leader_index":"MY_INDEX1",
          "follower_index":"MY_INDEX1",
          "shard_id":1,
          "leader_global_checkpoint": -1,
          "leader_max_seq_no": -1,
          "follower_global_checkpoint": -1,
          "follower_max_seq_no": -1,
          "last_requested_seq_no": -1,
          ..........
        }
      ]
    }
  ]
}
My questions:
- Replication is done at the shard level. As far as I understand, each follower shard replicates data from the leader shard with the same ID. I also understand that the follower can copy from either a primary or a replica shard on the leader (a primary and its replicas share the same shard ID). Does this mean that the number of elements in the shards array returned by the follower statistics equals the number of primary shards on the leader? In other words, does cross-cluster replication run only for primary shards, with the data then replicated locally to the replicas on the follower? Is this understanding correct?
- If the second primary shard on the leader is empty (it contains no documents), the follower statistics show negative values for some metrics, for example:
"leader_global_checkpoint": -1,
"leader_max_seq_no": -1,
"follower_global_checkpoint": -1,
"follower_max_seq_no": -1,
"last_requested_seq_no": -1,
Does a negative value always mean that there is no data in the primary shard, or can negative values also be caused by a problem during replication?
I opened a ticket describing this possible issue.
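A minimal Python sketch of the check this question is about, assuming (this is exactly what I am asking to have confirmed) that -1 is a sentinel meaning "no operations on this shard yet". The function name `shard_looks_empty` and the follower values for shard 0 in the sample are hypothetical, made up only for illustration.

```python
# Assumption: -1 means "no operations performed yet" on the shard.
NO_OPS = -1

# Field names taken from the follower stats output shown above.
CHECKPOINT_FIELDS = (
    "leader_global_checkpoint",
    "leader_max_seq_no",
    "follower_global_checkpoint",
    "follower_max_seq_no",
)


def shard_looks_empty(shard: dict) -> bool:
    """Return True when every checkpoint field equals the -1 sentinel."""
    return all(shard[field] == NO_OPS for field in CHECKPOINT_FIELDS)


# Sample data: shard 1 mirrors the output above; the follower values
# for shard 0 are hypothetical because the original output elides them.
shard_0 = {
    "shard_id": 0,
    "leader_global_checkpoint": 3,
    "leader_max_seq_no": 3,
    "follower_global_checkpoint": 3,
    "follower_max_seq_no": 3,
}
shard_1 = {
    "shard_id": 1,
    "leader_global_checkpoint": -1,
    "leader_max_seq_no": -1,
    "follower_global_checkpoint": -1,
    "follower_max_seq_no": -1,
}

for shard in (shard_0, shard_1):
    print(shard["shard_id"], shard_looks_empty(shard))
```

If the assumption does not hold, a shard where the leader values are non-negative but the follower values stay at -1 would be the case that needs a different interpretation.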
- I would like to enhance the Prometheus exporter so that it shows the user whether there is any replication lag. That is why I am asking the questions above: I want to understand whether I can sum the metric values across all shards of a specific index.
If negative values can be interpreted as "nothing to copy", this might work. If not, I am not sure how I could show the monitoring information to the user per index rather than per shard.
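A rough Python sketch of the per-index aggregation I have in mind, under the assumption that lag can be measured per shard as `leader_global_checkpoint` minus `follower_global_checkpoint`. The helper names `shard_lag` and `index_lag` are hypothetical, and the follower checkpoint for shard 0 is an invented value. For a shard where both sides report -1, the difference is 0, so it would not distort the sum.

```python
def shard_lag(shard: dict) -> int:
    """Operations the follower shard is behind the leader shard (never negative)."""
    lag = shard["leader_global_checkpoint"] - shard["follower_global_checkpoint"]
    return max(0, lag)


def index_lag(index_stats: dict) -> int:
    """Sum the per-shard lag over all shards of one follower index."""
    return sum(shard_lag(shard) for shard in index_stats["shards"])


# Sample shaped like one element of "indices" in the stats output above.
# The follower checkpoint of shard 0 is hypothetical (elided in the original).
sample_index = {
    "index": "MY_INDEX1",
    "shards": [
        {"shard_id": 0, "leader_global_checkpoint": 3, "follower_global_checkpoint": 1},
        {"shard_id": 1, "leader_global_checkpoint": -1, "follower_global_checkpoint": -1},
    ],
}

print(sample_index["index"], index_lag(sample_index))  # prints: MY_INDEX1 2
```

Whether the global checkpoint difference is the right definition of lag for the exporter is itself an assumption; the same pattern would apply to `leader_max_seq_no` and `follower_max_seq_no`.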
Thanks.