High OS CPU usage on 7.7.1

Hello,

We recently upgraded our ELK stack from 7.4.0 to 7.7.1, and we now see OS CPU at 100% almost constantly on the hot data nodes that handle indexing. Process CPU on the same nodes stays around 45-50%.
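For anyone who wants to compare the two numbers side by side, here is a minimal sketch that reads `os.cpu.percent` and `process.cpu.percent` from the node stats API. The `base_url` default and the function names are assumptions for illustration, not something from our setup.

```python
import json
import urllib.request


def fetch_node_stats(base_url="http://localhost:9200"):
    """Fetch OS and process stats for all nodes (base_url is an assumption)."""
    with urllib.request.urlopen(f"{base_url}/_nodes/stats/os,process") as resp:
        return json.load(resp)


def cpu_report(stats):
    """Return (node name, OS CPU %, process CPU %, gap) rows, largest gap first."""
    rows = []
    for node in stats["nodes"].values():
        os_pct = node["os"]["cpu"]["percent"]
        proc_pct = node["process"]["cpu"]["percent"]
        rows.append((node["name"], os_pct, proc_pct, os_pct - proc_pct))
    return sorted(rows, key=lambda row: row[3], reverse=True)
```

Calling `cpu_report(fetch_node_stats())` against a live cluster lists the nodes with the largest gap between OS and process CPU first, which makes it easier to compare the same nodes before and after an upgrade.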

We checked the hot threads, and about 98% of the time the busiest thread is a Lucene merge thread. The refresh interval on our indices is 120s, and the index size is 100 GB (3 shards).
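A rough sketch of how the hot threads check could be scripted: it pulls the plain-text output of `GET /_nodes/hot_threads` and counts how many of the reported threads are Lucene merge threads. The line pattern in `THREAD_LINE` is an assumption about the text layout and may need adjusting for your version; `base_url` and the function names are hypothetical too.

```python
import re
import urllib.request

# Assumed shape of a thread header line in the hot threads text output:
#   "98.1% (490.6ms out of 500ms) cpu usage by thread '<thread name>'"
THREAD_LINE = re.compile(r"([\d.]+)% \(.*?\) cpu usage by thread '(.*)'")


def fetch_hot_threads(base_url="http://localhost:9200", threads=10):
    """Return the raw hot threads text for all nodes (base_url is an assumption)."""
    url = f"{base_url}/_nodes/hot_threads?threads={threads}"
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8")


def merge_thread_share(hot_threads_text):
    """Return (merge_threads, total_threads) found in a hot threads dump."""
    total = 0
    merge = 0
    for line in hot_threads_text.splitlines():
        match = THREAD_LINE.search(line)
        if match is None:
            continue  # stack frames, node headers, blank lines
        total += 1
        if "Lucene Merge Thread" in match.group(2):
            merge += 1
    return merge, total
```

Sampling `merge_thread_share(fetch_hot_threads())` a few times over several minutes gives a less anecdotal picture than reading a single dump by eye.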

Is this a bug in 7.7.1? Or are we doing something wrong? This was not the case in previous versions.

Thank you.


We are also facing the same issue on our indexing nodes.

Another piece of information: I downgraded our cluster to 7.4.0 with the exact same config, and the CPU looks perfectly fine now.
This makes me wonder whether something in 7.7.1 is reporting incorrect metrics.

We reached the same conclusion; something seems off in the 7.6 and 7.7 releases.

I am wondering if the problem you are experiencing is due to wrong JVM settings after upgrading. See: Elasticsearch 7.8 worse heap management
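As a quick way to see whether the JVM is under pressure, something like the sketch below could summarize heap use and GC counters from `GET /_nodes/stats/jvm`. It is only an illustration: `base_url` and the function names are assumptions, and the counters are cumulative since node start, so two samples taken some time apart are needed to judge the current rate.

```python
import json
import urllib.request


def fetch_jvm_stats(base_url="http://localhost:9200"):
    """Fetch JVM stats for all nodes (base_url is an assumption)."""
    with urllib.request.urlopen(f"{base_url}/_nodes/stats/jvm") as resp:
        return json.load(resp)


def gc_summary(stats):
    """Map node name -> heap used % and per-collector (count, total millis)."""
    summary = {}
    for node in stats["nodes"].values():
        jvm = node["jvm"]
        collectors = {
            name: (c["collection_count"], c["collection_time_in_millis"])
            for name, c in jvm["gc"]["collectors"].items()
        }
        summary[node["name"]] = {
            "heap_used_percent": jvm["mem"]["heap_used_percent"],
            "collectors": collectors,
        }
    return summary
```

Comparing `gc_summary(fetch_jvm_stats())` between a 7.4.0 node and a 7.7.1 node under similar load would show whether collection counts or times differ noticeably.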


Yeah, is any GC going on? If not, this seems odd unless there were changes to merge behavior or scheduling (where you can only limit the thread count), but I doubt anything big changed between 7.4 and 7.7. You could try tuning merging just to see if that's the issue, and also upgrade to 7.8 to see if that helps.
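To try the merge tuning experiment, a sketch like the following could cap merge threads per shard through the `index.merge.scheduler.max_thread_count` index setting. The index pattern, the value, `base_url`, and the helper names are all placeholders for illustration; treat it as a test on a non-critical index rather than a recommended value.

```python
import json
import urllib.request


def build_merge_settings_request(index, max_threads, base_url="http://localhost:9200"):
    """Build a PUT <index>/_settings request limiting merge threads per shard."""
    body = {"index": {"merge": {"scheduler": {"max_thread_count": max_threads}}}}
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def send(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

For example, `send(build_merge_settings_request("my-hot-index", 1))` would limit each shard of the hypothetical `my-hot-index` to one merge thread. If OS CPU drops noticeably afterwards, that points at merging rather than at metric reporting.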

Any changes to templates, especially to the refresh interval?
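One way to answer this without reading every template by hand is to pull the refresh interval out of the template and index settings responses. This is a minimal sketch: `base_url`, the helper names, and the index pattern in the usage note are assumptions, and it relies on the `flat_settings=true` query parameter so that the key appears as `index.refresh_interval`.

```python
import json
import urllib.request


def fetch_json(path, base_url="http://localhost:9200"):
    """GET a path from the cluster and decode the JSON body (base_url is an assumption)."""
    with urllib.request.urlopen(f"{base_url}{path}") as resp:
        return json.load(resp)


def refresh_intervals(response):
    """Map each template or index name to its index.refresh_interval, or None if unset."""
    return {
        name: entry.get("settings", {}).get("index.refresh_interval")
        for name, entry in response.items()
    }
```

`refresh_intervals(fetch_json("/_template?flat_settings=true"))` covers the templates, and `refresh_intervals(fetch_json("/my-index-*/_settings?flat_settings=true"))` covers the live indices for a hypothetical `my-index-*` pattern. A `None` value means the setting is not set explicitly, so the default applies.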
