Occasional slow queries on a 5-node ES cluster

I have a 5-node ES cluster. It performs well on average, but about 0.5% of the queries take around 1 sec to respond. I have tried every suggestion I could find and am still not sure what is causing the issue.
Overview of the cluster:
3 master nodes, each with 4 GB of heap allocated out of 8 GB RAM
2 data nodes, each with 4 GB of heap allocated out of 64 GB RAM; about 70% of the heap is in use. Should I consider increasing the heap size? (Average throughput is 15,000 TPS.)

On all the master nodes, OS memory is 94% used.

Please suggest what I should do to handle 15,000 TPS with no slow queries.

Welcome to our community! :smiley:

That'd be a start.

Otherwise what does hot threads show at that time? Or slow log? Do you have Monitoring enabled? What version are you on?
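
For reference, a minimal sketch of how the hot threads output could be captured while a slow query is in flight. It assumes an unsecured cluster reachable at `http://localhost:9200` (an assumption, adjust as needed); `hot_threads_url` and `fetch_text` are hypothetical helper names, while `_nodes/hot_threads` with its `threads` and `interval` parameters is the standard nodes hot threads endpoint.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def hot_threads_url(base_url, threads=3, interval="500ms"):
    """Build the URL for the nodes hot threads API.

    base_url is an assumption (e.g. http://localhost:9200); threads and
    interval are the API's own query parameters.
    """
    query = urlencode({"threads": threads, "interval": interval})
    return f"{base_url.rstrip('/')}/_nodes/hot_threads?{query}"


def fetch_text(url, timeout=10):
    """Fetch a plain-text response; hot threads is returned as text, not JSON."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Usage, against a reachable cluster:
#   print(fetch_text(hot_threads_url("http://localhost:9200")))
```

Capturing this a few times during a latency spike, rather than once, makes it easier to see which threads are consistently busy.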

Thanks, Mark, for the quick response.
We have set the slow log threshold to 5 ms, because we want a maximum response time of 5 ms for every query. About 90% of the queries are below the threshold, and a few queries have been logged in the slow log. Some 0.5% of the queries take much longer (1 sec). We are using version 7.10.
One more observation: on each master node, OS memory is 96% used. Is that fine, or do we need to allocate more?
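
A slow log threshold like the one described above could be applied roughly as follows. This is only a sketch: the index name `my-index`, the base URL, and the choice of the `warn` level are assumptions (the post does not say which level is used); the setting names are the standard search slow log index settings.

```python
import json
from urllib.request import Request


def slowlog_settings_request(base_url, index, threshold="5ms"):
    """Build (but do not send) a PUT <index>/_settings request.

    Sets the query and fetch slow log thresholds at the warn level.
    base_url and index are placeholders, not values from the thread.
    """
    body = {
        "index.search.slowlog.threshold.query.warn": threshold,
        "index.search.slowlog.threshold.fetch.warn": threshold,
    }
    return Request(
        url=f"{base_url.rstrip('/')}/{index}/_settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Usage, against a reachable cluster:
#   from urllib.request import urlopen
#   urlopen(slowlog_settings_request("http://localhost:9200", "my-index"))
```

Note that the slow log records per-shard execution time, so an entry there can differ from the end-to-end latency seen by the client.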

Is that system memory or heap?

It's RAM.

So system memory. In that case it's not a major concern; the OS normally uses spare RAM for the filesystem cache.
If your heap on the data nodes is 4 gig and you have 64 gig, then increasing heap would be worth doing.
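
As a rough illustration of the arithmetic, here is a sketch based on the general heap sizing guidance (no more than half of physical RAM, and below the compressed object pointer threshold). The 26 GB cap is a conservative assumption, the function names are hypothetical, and the result is an upper bound rather than a recommended target for this cluster.

```python
def max_heap_gb(ram_gb, oops_cap_gb=26):
    """Upper bound for the heap: half of RAM, capped below the
    compressed-oops threshold (26 GB is an assumed conservative cap)."""
    return min(ram_gb // 2, oops_cap_gb)


def jvm_options_lines(heap_gb):
    """Render matching min/max heap flags for jvm.options.
    Xms and Xmx are set to the same value."""
    return [f"-Xms{heap_gb}g", f"-Xmx{heap_gb}g"]


# Data node with 64 GB RAM: the ceiling is 26 GB under these assumptions.
print(max_heap_gb(64))
# Master node with 8 GB RAM: 4 GB is already at the 50% guideline.
print(max_heap_gb(8))
# Example flags for a hypothetical intermediate step of 16 GB.
print("\n".join(jvm_options_lines(16)))
```

The rest of the RAM is not wasted: it stays available to the filesystem cache, which search performance depends on.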

OK, I will try that. Thanks.

What is the full output of the cat nodes API and the cluster stats API?
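
A small sketch of the two requests being asked for, assuming the cluster is reachable at `http://localhost:9200` without authentication (an assumption). `diagnostics_urls` is a hypothetical helper; `_cat/nodes` and `_cluster/stats` are the standard endpoints, and `v`, `human`, and `pretty` only affect how the output is formatted.

```python
from urllib.parse import urlencode


def diagnostics_urls(base_url):
    """Return the URLs for the cat nodes and cluster stats APIs.

    base_url is a placeholder; v=true adds column headers to the cat
    output, human/pretty make the cluster stats JSON easier to read.
    """
    base = base_url.rstrip("/")
    return {
        "cat_nodes": f"{base}/_cat/nodes?" + urlencode({"v": "true"}),
        "cluster_stats": f"{base}/_cluster/stats?"
        + urlencode({"human": "true", "pretty": "true"}),
    }


# Usage, against a reachable cluster:
#   from urllib.request import urlopen
#   for name, url in diagnostics_urls("http://localhost:9200").items():
#       print(name, urlopen(url).read().decode("utf-8"), sep="\n")
```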

Christian and Mark, should I also consider giving the master nodes more heap? I kept it at 4 gig on the assumption that master nodes don't do much compute.
What kind of machine instance should we use for the master and data nodes, for example general purpose, memory optimized, or CPU optimized? Please suggest.

Are they master only nodes?

Yes, they are dedicated master nodes.

Then that should be fine; master nodes don't need a lot.