Hi,
I'm testing an Elasticsearch cluster on Azure AKS.
The AKS cluster has 5 nodes (Standard E8s v3), three of which are dedicated to Elasticsearch. All three Elasticsearch nodes act as both data and master nodes. Each pod has an additional persistent volume (Premium SSD, 1 TB) for data.
We are getting a lot of log entries like:
[INFO ][o.e.m.j.JvmGcMonitorService] [elasticsearch-node-1] [gc][6799] overhead
[WARN ][o.e.m.j.JvmGcMonitorService] [elasticsearch-node-1] [gc][6799] overhead
From time to time we also reach the queue size limit (200).
I used Rally to benchmark the Elasticsearch cluster:
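To see which thread pool is hitting the queue limit, something like the sketch below could be used. It relies on the `_cat/thread_pool` API with `format=json`; the host name `elasticsearch:9200` is taken from my Rally command, while the function names `fetch_thread_pools` and `pools_with_rejections` are just my own illustrative helpers, not part of any library.

```python
import json
from urllib.request import urlopen


def fetch_thread_pools(base_url="http://elasticsearch:9200"):
    """Fetch per-node thread pool stats as a list of dicts (values are strings)."""
    url = (base_url + "/_cat/thread_pool"
           "?format=json&h=node_name,name,active,queue,rejected")
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def pools_with_rejections(rows):
    """Return (node, pool, queue, rejected) for every pool that rejected tasks."""
    hits = []
    for row in rows:
        queue = int(row["queue"])
        rejected = int(row["rejected"])
        if rejected > 0:
            hits.append((row["node_name"], row["name"], queue, rejected))
    return hits
```

Calling `pools_with_rejections(fetch_thread_pools())` during a benchmark run should show whether the rejections come from the indexing pool or from somewhere else.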
esrally --pipeline=benchmark-only --target-hosts=elasticsearch:9200 --track=geopoint --challenge=append-fast-with-conflicts
| Lap | Metric | Task | Value | Unit |
|---|---|---|---|---|
| All | Total Young Gen GC | | 2245.89 | s |
| All | Total Old Gen GC | | 0.452 | s |
| All | Min Throughput | index-update | 21163.6 | docs/s |
| All | Median Throughput | index-update | 21969.2 | docs/s |
| All | Max Throughput | index-update | 26891.7 | docs/s |
To check whether this is a VM sizing issue, I created a single VM (E8s v3) on Azure and ran the same benchmark there.
I see a huge difference in max throughput between the three-node cluster on AKS (above) and the single VM (below):
| Lap | Metric | Task | Value | Unit |
|---|---|---|---|---|
| All | Total Young Gen GC | | 78.33 | s |
| All | Total Old Gen GC | | 0.217 | s |
| All | Min Throughput | index-update | 94740.5 | docs/s |
| All | Median Throughput | index-update | 106737 | docs/s |
| All | Max Throughput | index-update | 126790 | docs/s |
I also tested a one-node Elasticsearch cluster on AKS:
| Lap | Metric | Task | Value | Unit |
|---|---|---|---|---|
| All | Total Young Gen GC | | 3424.56 | s |
| All | Total Old Gen GC | | 0.428 | s |
| All | Min Throughput | index-update | 6633.7 | docs/s |
| All | Median Throughput | index-update | 7144.32 | docs/s |
| All | Max Throughput | index-update | 7473.19 | docs/s |
Does anyone have experience running Elasticsearch on Azure AKS?
Maybe I should set up Elasticsearch differently there than on a VM.
Krzysiek