A major issue with cluster state handling and persistent tasks cancellation

We are using ES 9.3.0.

Our cluster has many data streams and indices. The cluster state is ~350 MB (compressed on disk) under normal conditions.

I mistakenly scheduled a large number of downsampling tasks for historical data, which created ~2,200 persistent tasks.

After that, I canceled the tasks and removed the ILM policy from the data stream. However, logs show that each canceled task triggers a full cluster state update (why? I just canceled the tasks).

Each update currently takes ~20 seconds, making the cluster effectively unusable.

[2026-04-24T13:55:53,612][WARN ][o.e.g.PersistedClusterStateService] [hssf43] writing cluster state took [21473ms] which is above the warn threshold of [10s]; [wrote] global metadata, wrote [0] new mappings, removed [0] mappings and skipped [1514] unchanged mappings, wrote metadata for [0] new indices and [0] existing indices, removed metadata for [0] indices and skipped [2219] unchanged indices

[2026-04-24T13:56:44,213][WARN ][o.e.g.PersistedClusterStateService] [hssf43] writing cluster state took [21408ms] which is above the warn threshold of [10s]; [wrote] global metadata, wrote [0] new mappings, removed [0] mappings and skipped [1514] unchanged mappings, wrote metadata for [0] new indices and [0] existing indices, removed metadata for [0] indices and skipped [2219] unchanged indices

[2026-04-24T13:57:35,750][WARN ][o.e.g.PersistedClusterStateService] [hssf43] writing cluster state took [21693ms] which is above the warn threshold of [10s]; [wrote] global metadata, wrote [0] new mappings, removed [0] mappings and skipped [1514] unchanged mappings, wrote metadata for [0] new indices and [0] existing indices, removed metadata for [0] indices and skipped [2219] unchanged indices

(and so on, one near-identical warning roughly every 50 seconds)

Is there a way to speed up this process or mitigate the impact?

p.s. Currently there's no write activity from users.

Persistent tasks are recorded in the cluster state, so cancelling a persistent task requires a cluster state update. These are in fact not "full" cluster state updates; they're incremental (see "skipped [1514] unchanged mappings" and "skipped [2219] unchanged indices" in your logs), but that doesn't really help you here.
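As a minimal sketch of what that means (the field layout below follows what GET _cluster/state/metadata returns under metadata.persistent_tasks, but the task data itself is invented for illustration): removing each finished task produces a new cluster-state version that the master must persist, so ~2,200 cancellations imply ~2,200 state writes.

```python
# Toy model of how a persistent-task cancellation implies a cluster state update.
# The layout mirrors metadata.persistent_tasks in the cluster state; the task
# ids and contents here are made up.
import copy

cluster_state = {
    "version": 100,
    "metadata": {
        "persistent_tasks": {
            "tasks": [
                {"id": "downsample-task-a", "task": {}},
                {"id": "downsample-task-b", "task": {}},
            ],
        }
    },
}

def finish_task(state, task_id):
    """Removing a persistent task yields a *new* state version that the
    master must persist to disk - one write per cancelled task."""
    new_state = copy.deepcopy(state)
    tasks = new_state["metadata"]["persistent_tasks"]["tasks"]
    new_state["metadata"]["persistent_tasks"]["tasks"] = [
        t for t in tasks if t["id"] != task_id
    ]
    new_state["version"] += 1
    return new_state

after = finish_task(cluster_state, "downsample-task-b")
print(after["version"])                                      # 101
print(len(after["metadata"]["persistent_tasks"]["tasks"]))   # 1
```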

I don't have any other suggestions beyond waiting for these cancellations to complete. At the current rate (~2,200 tasks at roughly 20 s per state write) I'd guess it'll be done in 12 hours or so, although the writes should get faster as the number of remaining tasks decreases.

Hi, David!

Thank you for answering.

If this is supposed to be a delta update, why does each cluster state update result in a new segment being written to disk that is about 90–95% of the total state size?

And one more clarification question: when all tasks have disappeared from the Tasks API and there is no ILM policy that created them, can we be sure they won’t be restored after a cluster restart?

It's the [wrote] global metadata bit - this is where persistent tasks are stored, and it's not subdivided so we have to rewrite the whole thing on each change.
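A toy sketch of what that log line reflects (this is not the real Lucene-based writer, just an illustration of the split): the persisted state has per-index entries that can be skipped when unchanged, plus one undivided "global metadata" blob that contains the persistent tasks, so any task change forces rewriting the whole blob.

```python
# Illustrative sketch only: per-index metadata is skippable when unchanged,
# but the global metadata (which holds persistent tasks) is one blob that is
# re-serialized in full on any change to it.
import json

def write_state(state, previous_serialized_indices):
    written, skipped = [], 0
    # Per-index metadata: only rewrite entries that actually changed.
    for name, index_md in state["indices"].items():
        serialized = json.dumps(index_md, sort_keys=True)
        if previous_serialized_indices.get(name) == serialized:
            skipped += 1
        else:
            written.append(name)
    # Global metadata (incl. persistent tasks) is not subdivided:
    # even a one-task change means serializing all of it again.
    global_blob = json.dumps(state["global"], sort_keys=True)
    return written, skipped, len(global_blob)

state = {
    "indices": {"idx-a": {"settings": 1}, "idx-b": {"settings": 2}},
    "global": {"persistent_tasks": ["task-1", "task-2"]},
}
prev = {n: json.dumps(md, sort_keys=True) for n, md in state["indices"].items()}

state["global"]["persistent_tasks"].pop()          # cancel one task
written, skipped, blob_size = write_state(state, prev)
print(written, skipped)   # [] 2  -> all index metadata skipped...
print(blob_size > 0)      # True  -> ...but the global blob is rewritten whole
```

This matches the shape of the warning above: "wrote global metadata ... skipped [2219] unchanged indices".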

You'd need to watch GET _cluster/state to see the actual state of these persistent tasks.

You'd need to watch GET _cluster/state

Not sure I understand the situation. I canceled the tasks, and they disappeared from the _tasks API results, but they are still present in the cluster state.

grep -c rollup-shard cluster_state_full.json
2234

If it matters, ILM is stopped globally.
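For a more structured count than a raw grep, the dump can be parsed and the entries under metadata.persistent_tasks counted directly; a sketch with the json stdlib, using invented in-memory data in place of cluster_state_full.json:

```python
# Count persistent tasks by structure rather than by grepping raw text.
# With a real dump you would load it first, e.g.:
#     with open("cluster_state_full.json") as f:
#         state = json.load(f)
# Here an invented in-memory state stands in for the file.
import json

state = {
    "metadata": {
        "persistent_tasks": {
            "tasks": [
                {"id": "downsample-task-a"},
                {"id": "downsample-task-b"},
                {"id": "some-other-task"},
            ]
        }
    }
}

tasks = state["metadata"]["persistent_tasks"]["tasks"]
downsample = [t for t in tasks if "downsample" in t["id"]]
print(len(downsample))   # 2
```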

There are (at least?) three different things called "tasks" in Elasticsearch. The ones in GET _tasks are things that are actively running in the system at that moment. The ones in the cluster state are persistent tasks, which normally correspond to an active task but may not be assigned to a node at a given time. Then there's GET _cluster/pending_tasks, which lists queued cluster state updates. It's confusing indeed.

According to the pending tasks API, I see my not-yet-canceled tasks as:

"source": "update project [default] task state [downsample-downsample-5m-.ds-metrics-otelcol.v1-devops-2025.10.12-000106-1-5m]",

and tasks that have already been updated (after cancellation) as:

"source": "finish project [default] persistent task [downsample-downsample-5m-.ds-metrics-otelcol.v1-devops-2025.12.22-000276-2-5m] (success)",

So, do I need to do anything additional to completely remove them from the cluster state? I'm concerned that they might be restored later.