Elastic Flush/fetch latency too high

Hello,

From Zabbix monitoring I'm getting a lot of notifications that fetch/flush latency is too high.

Are these values within a normal margin, or is this a problem? Over the last 12 hours the average was 5000 ms and the max 34000 ms.

Would changing durability to async fix the issue? "index.translog.durability": "async"
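
For context, this is how I understand the change would be applied, via the update index settings API. A minimal sketch only (the host and my-index are placeholders for my setup, and as far as I understand it, async means the last few seconds of acknowledged writes could be lost on a crash):

    # sketch: apply async translog durability to one index (my-index / host are placeholders)
    curl -k -u "${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}" \
      -X PUT "https://localhost:9200/my-index/_settings" \
      -H 'Content-Type: application/json' \
      -d '{ "index.translog.durability": "async" }'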

My current settings:

[elasticsearch@elasticsearch-master-0 ~]$ curl -k -u "${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}" \

name flush.total flush.total_time
elasticsearch-master-0 2692 3.9h
elasticsearch-master-1 2699 4.2h
elasticsearch-master-2 2747 4.5h

name heap.percent heap.current heap.max ram.current ram.max
elasticsearch-master-0 67 5.4gb 8gb 11.9gb 12gb
elasticsearch-master-1 67 5.4gb 8gb 11.9gb 12gb
elasticsearch-master-2 46 3.7gb 8gb 11.9gb 12gb

name segments.count segments.memory merges.current
elasticsearch-master-0 624 28.9mb 2
elasticsearch-master-1 547 24.5mb 1
elasticsearch-master-2 579 25.6mb 1

[elasticsearch@elasticsearch-master-0 ~]$ curl -k -u "${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}" \

"settings?include_defaults=true&pretty" | \
grep -A 5 "translog"
"translog" : {
  "generation_threshold_size" : "64mb",
  "flush_threshold_size" : "512mb",
  "sync_interval" : "5s",
  "retention" : {
    "size" : "512MB",

image: **elasticsearch:7.10.1**

**PVC Size:** 4096 GiB
**Storage type:** Standard SSD LRS

      resources:
        limits:
          cpu: '2'
          memory: 12Gi
        requests:
          cpu: 100m
          memory: 4Gi

Thanks for the help.

Note that this version is very old and has been EOL for a very long time. I would recommend that you upgrade, ideally to the latest version.

It sounds like your storage may be too slow for your use case. I would recommend switching to faster storage to see if that resolves the issue.

It may have an impact, but I do not use this setting, as it reduces resiliency and does not solve the core problem, which is potentially slow storage.
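
If you want to verify that storage is the bottleneck before paying for faster disks, a tool like fio can approximate the translog's small, fsync-heavy writes. A rough sketch only (the data path assumes the default layout of the official image; fio writes its own test file, but run it off-peak since it will compete with Elasticsearch for I/O):

    # 4k random writes with an fsync after every write, loosely mimicking translog flushes
    # /usr/share/elasticsearch/data is the default data path in the official image (assumption)
    fio --name=translog-test --directory=/usr/share/elasticsearch/data \
        --rw=randwrite --bs=4k --size=1g --numjobs=1 --iodepth=1 \
        --fsync=1 --runtime=60 --time_based

If the reported sync latencies come back in the tens of milliseconds or worse, the disk is likely the problem.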

Hi,

Thanks for the reply.

Standard SSD has a max throughput of 750 MB/s and 6,000 IOPS.
Would Premium SSD v2 with 20k IOPS be enough?

Do you have monitoring of IOPS and disk I/O in place? If so, do you see any correlation with the times when you are having issues?
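
If not, the node stats API exposes cumulative disk I/O counters (Linux only), which can be a quick cross-check from the Elasticsearch side. A sketch reusing the credentials from your earlier commands (the host is a placeholder):

    # cumulative read/write operation counts per data path device (Linux only)
    curl -k -u "${ELASTIC_USERNAME}:${ELASTIC_PASSWORD}" \
      "https://localhost:9200/_nodes/stats/fs?pretty" | \
      grep -A 10 '"io_stats"'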

From Azure monitoring:
Over the last 4 h, disk read operations averaged 82.95/s with a max of 520/s.

I do not use Azure a lot, but it seems IOPS varies by disk size. What is the size of the disk you are using?

Also no real clue on Azure, but the little asterisk on the 6000 IOPS line says:

* Only applies to disks with performance plus enabled.

Otherwise it's 500 IOPS per disk (base) for all disk sizes according to that chart, and throughput is also pretty limited.
