Parallelizing cluster state listeners

Based on my understanding, the ClusterApplierService uses callClusterStateListeners to invoke listeners serially, which slows down cluster state publication. The work done by the listeners involves invoking a TransportAction on the master, which adds a task to the pending queue that is eventually processed serially in batches. Given that, does it make sense to parallelize the listeners?

Not really, because some listeners depend for their correctness on not running concurrently with other cluster state application activity. Each listener can of course do some of its work asynchronously if needed.
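To illustrate the last point, here is a minimal sketch of a listener that returns quickly and hands its slow work to its own executor. It assumes nothing about the real Elasticsearch classes: `StateListener`, `AsyncListener` and `SerialNotifier` are hypothetical stand-ins, and the `long version` argument is a placeholder for whatever state the listener would actually receive.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a cluster state listener; not the real Elasticsearch interface.
interface StateListener {
    void stateChanged(long version);
}

// A listener whose callback only enqueues work, so the serial notification loop is not held up.
final class AsyncListener implements StateListener {
    private final ExecutorService executor;
    // Versions whose slow work has completed, in completion order.
    final List<Long> processed = new CopyOnWriteArrayList<>();

    AsyncListener(ExecutorService executor) {
        this.executor = executor;
    }

    @Override
    public void stateChanged(long version) {
        // Return immediately; the expensive part runs on the listener's own executor.
        executor.execute(() -> slowWork(version));
    }

    private void slowWork(long version) {
        // Placeholder for the expensive part of the listener.
        processed.add(version);
    }
}

// Hypothetical serial notification loop: listeners are still invoked one after another.
final class SerialNotifier {
    static void publish(List<StateListener> listeners, long version) {
        for (StateListener listener : listeners) {
            listener.stateChanged(version);
        }
    }
}
```

With a single-threaded executor the listener's work for successive states still runs in order, but off the notifying thread. Whether that is safe depends on the listener: the slow work must not assume that the state it was called with is still the current one.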


But the listeners run after the cluster state appliers, and no cluster state update is supposed to happen while they are running, right? When you say

concurrently with other cluster state application activity

do you mean that the cluster state can be modified concurrently by the listeners?

No, the listeners cannot modify the cluster state. I do not really understand what you're asking. Could you be more specific?

@DavidTurner I didn't get this part:

some listeners depend for their correctness on not running concurrently with other cluster state application activity

Can you provide any specifics on the listener and the activity you are referring to here?
Also, do you think we should separate out the listeners that are plain observers and can safely execute concurrently, so that at least some of them can run in parallel?

For instance, IndicesStore is responsible for deleting unneeded shard data, but can run into problems if it's doing so concurrently with IndicesClusterStateService applying a cluster state that reinstates a shard copy that was relocated away from a node and then back again.

@DavidTurner Thanks, but here is my confusion:
IndicesClusterStateService implements ClusterStateApplier while IndicesStore implements ClusterStateListener, and appliers are invoked before the listeners. Is it because the IndicesStore calls the applier service?

My point is that we should allow listeners to execute in parallel when they don't conflict with each other, so as to make cluster state publication faster.

The idea seems unnecessarily complicated. It'd be preferable to investigate the slow listeners and see if they can be sped up or made asynchronous on a case-by-case basis.
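As a starting point for that kind of investigation, a rough sketch of timing each listener in a serial loop might look like the following. `ListenerTimer` and the listener names are hypothetical and exist only for illustration; this is not how ClusterApplierService is implemented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: runs named listeners one after another and records how long each took.
final class ListenerTimer {
    // Returns elapsed nanoseconds per listener, in invocation order.
    static Map<String, Long> timeSerially(Map<String, Runnable> listeners) {
        Map<String, Long> elapsedNanos = new LinkedHashMap<>();
        for (Map.Entry<String, Runnable> entry : listeners.entrySet()) {
            long start = System.nanoTime();
            entry.getValue().run();
            elapsedNanos.put(entry.getKey(), System.nanoTime() - start);
        }
        return elapsedNanos;
    }
}
```

Sorting the resulting map by value shows which listeners dominate the serial loop, and those are the candidates to speed up or make asynchronous.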

