Elasticsearch on AKS

Hi,

I'm testing an Elasticsearch cluster in an Azure AKS environment.
I'm using a five-node AKS cluster (Standard E8s v3). Three of the nodes are dedicated to the Elasticsearch cluster, and all three act as combined data + master nodes. Each pod has an additional persistent volume (1 TB Premium SSD) for data.
We are getting lots of logs like:
[INFO ][o.e.m.j.JvmGcMonitorService] [elasticsearch-node-1] [gc][6799] overhead
[WARN ][o.e.m.j.JvmGcMonitorService] [elasticsearch-node-1] [gc][6799] overhead

From time to time we also hit the queue size limit (200).
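
We watch the rejections with the cat thread pool API, e.g. (elasticsearch:9200 is just the service name from our setup):

curl -s 'http://elasticsearch:9200/_cat/thread_pool/write?v&h=node_name,name,active,queue,rejected'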

I used Rally to benchmark the cluster:
esrally --pipeline=benchmark-only --target-hosts=elasticsearch:9200 --track=geopoint --challenge=append-fast-with-conflicts

| Lap | Metric | Task | Value | Unit |
|-----|--------|------|-------|------|
| All | Total Young Gen GC | | 2245.89 | s |
| All | Total Old Gen GC | | 0.452 | s |
| All | Min Throughput | index-update | 21163.6 | docs/s |
| All | Median Throughput | index-update | 21969.2 | docs/s |
| All | Max Throughput | index-update | 26891.7 | docs/s |

To check whether this might be a VM sizing issue, I created a single VM (E8s v3) on Azure and ran the same benchmark.
I see a huge difference in Max Throughput between the three-node cluster on AKS and the one VM.

| Lap | Metric | Task | Value | Unit |
|-----|--------|------|-------|------|
| All | Total Young Gen GC | | 78.33 | s |
| All | Total Old Gen GC | | 0.217 | s |
| All | Min Throughput | index-update | 94740.5 | docs/s |
| All | Median Throughput | index-update | 106737 | docs/s |
| All | Max Throughput | index-update | 126790 | docs/s |

I also tested a one-node Elasticsearch cluster on AKS.

| Lap | Metric | Task | Value | Unit |
|-----|--------|------|-------|------|
| All | Total Young Gen GC | | 3424.56 | s |
| All | Total Old Gen GC | | 0.428 | s |
| All | Min Throughput | index-update | 6633.7 | docs/s |
| All | Median Throughput | index-update | 7144.32 | docs/s |
| All | Max Throughput | index-update | 7473.19 | docs/s |

Do you have any experience with Azure AKS?
Maybe I should set up Elasticsearch differently there than on a plain VM?

Krzysiek

AKS three-node cluster

| Lap | Metric | Task | Value | Unit |
|-----|--------|------|-------|------|
| All | Total indexing time | | 67.0337 | min |
| All | Min indexing time per shard | | 0 | min |
| All | Median indexing time per shard | | 0 | min |
| All | Max indexing time per shard | | 12.9383 | min |
| All | Total merge time | | 17.6462 | min |
| All | Min merge time per shard | | 0 | min |
| All | Median merge time per shard | | 0 | min |
| All | Max merge time per shard | | 4.2327 | min |
| All | Total merge throttle time | | 1.34357 | min |
| All | Min merge throttle time per shard | | 0 | min |
| All | Median merge throttle time per shard | | 0 | min |
| All | Max merge throttle time per shard | | 0.249117 | min |
| All | Total refresh time | | 15.7534 | min |
| All | Min refresh time per shard | | 0 | min |
| All | Median refresh time per shard | | 0 | min |
| All | Max refresh time per shard | | 3.03417 | min |
| All | Total flush time | | 0.0199333 | min |
| All | Min flush time per shard | | 0 | min |
| All | Median flush time per shard | | 0 | min |
| All | Max flush time per shard | | 0.01645 | min |
| All | Total Young Gen GC | | 2245.89 | s |
| All | Total Old Gen GC | | 0.452 | s |
| All | Store size | | 63.3685 | GB |
| All | Translog size | | 4.93038 | GB |
| All | Heap used for segments | | 47.8294 | MB |
| All | Heap used for doc values | | 4.14286 | MB |
| All | Heap used for terms | | 25.2056 | MB |
| All | Heap used for norms | | 0.0022583 | MB |
| All | Heap used for points | | 12.9216 | MB |
| All | Heap used for stored fields | | 5.557 | MB |
| All | Segment count | | 339 | |
| All | Min Throughput | index-update | 21163.6 | docs/s |
| All | Median Throughput | index-update | 21969.2 | docs/s |
| All | Max Throughput | index-update | 26891.7 | docs/s |
| All | 50th percentile latency | index-update | 1800.68 | ms |
| All | 90th percentile latency | index-update | 2468.89 | ms |
| All | 99th percentile latency | index-update | 3369.75 | ms |
| All | 99.9th percentile latency | index-update | 4504.29 | ms |
| All | 99.99th percentile latency | index-update | 5358.83 | ms |
| All | 100th percentile latency | index-update | 5416.31 | ms |
| All | 50th percentile service time | index-update | 1800.68 | ms |
| All | 90th percentile service time | index-update | 2468.89 | ms |
| All | 99th percentile service time | index-update | 3369.75 | ms |
| All | 99.9th percentile service time | index-update | 4504.29 | ms |
| All | 99.99th percentile service time | index-update | 5358.83 | ms |
| All | 100th percentile service time | index-update | 5416.31 | ms |
| All | error rate | index-update | 0 | % |

One VM

| Lap | Metric | Task | Value | Unit |
|-----|--------|------|-------|------|
| All | Total indexing time | | 96.1971 | min |
| All | Min indexing time per shard | | 0.000666667 | min |
| All | Median indexing time per shard | | 4.93302 | min |
| All | Max indexing time per shard | | 12.4609 | min |
| All | Total merge time | | 5.79252 | min |
| All | Min merge time per shard | | 0 | min |
| All | Median merge time per shard | | 0.253533 | min |
| All | Max merge time per shard | | 1.04133 | min |
| All | Total merge throttle time | | 0.154983 | min |
| All | Min merge throttle time per shard | | 0 | min |
| All | Median merge throttle time per shard | | 0.00508333 | min |
| All | Max merge throttle time per shard | | 0.05465 | min |
| All | Total refresh time | | 8.19662 | min |
| All | Min refresh time per shard | | 0.00113333 | min |
| All | Median refresh time per shard | | 0.54775 | min |
| All | Max refresh time per shard | | 0.943633 | min |
| All | Total flush time | | 0.000133333 | min |
| All | Min flush time per shard | | 0 | min |
| All | Median flush time per shard | | 0 | min |
| All | Max flush time per shard | | 0.000133333 | min |
| All | Total Young Gen GC | | 78.33 | s |
| All | Total Old Gen GC | | 0.217 | s |
| All | Store size | | 6.32997 | GB |
| All | Translog size | | 6.31995 | GB |
| All | Heap used for segments | | 30.7482 | MB |
| All | Heap used for doc values | | 0.0181198 | MB |
| All | Heap used for terms | | 28.1291 | MB |
| All | Heap used for norms | | 0.0639038 | MB |
| All | Heap used for points | | 0.964798 | MB |
| All | Heap used for stored fields | | 1.57222 | MB |
| All | Segment count | | 192 | |
| All | Min Throughput | index-update | 94740.5 | docs/s |
| All | Median Throughput | index-update | 106737 | docs/s |
| All | Max Throughput | index-update | 126790 | docs/s |
| All | 50th percentile latency | index-update | 315.807 | ms |
| All | 90th percentile latency | index-update | 755.748 | ms |
| All | 99th percentile latency | index-update | 2455.04 | ms |
| All | 99.9th percentile latency | index-update | 5273.72 | ms |
| All | 100th percentile latency | index-update | 5694.4 | ms |
| All | 50th percentile service time | index-update | 315.807 | ms |
| All | 90th percentile service time | index-update | 755.748 | ms |
| All | 99th percentile service time | index-update | 2455.04 | ms |
| All | 99.9th percentile service time | index-update | 5273.72 | ms |
| All | 100th percentile service time | index-update | 5694.4 | ms |
| All | error rate | index-update | 0 | % |

AKS one-node cluster
| Lap | Metric | Task | Value | Unit |
|-----|--------|------|-------|------|
| All | Total indexing time | | 87.0735 | min |
| All | Min indexing time per shard | | 0 | min |
| All | Median indexing time per shard | | 5.83333e-05 | min |
| All | Max indexing time per shard | | 16.0536 | min |
| All | Total merge time | | 52.8072 | min |
| All | Min merge time per shard | | 0 | min |
| All | Median merge time per shard | | 0 | min |
| All | Max merge time per shard | | 9.01507 | min |
| All | Total merge throttle time | | 0.40045 | min |
| All | Min merge throttle time per shard | | 0 | min |
| All | Median merge throttle time per shard | | 0 | min |
| All | Max merge throttle time per shard | | 0.0894833 | min |
| All | Total refresh time | | 19.3158 | min |
| All | Min refresh time per shard | | 0 | min |
| All | Median refresh time per shard | | 0.000133333 | min |
| All | Max refresh time per shard | | 3.26023 | min |
| All | Total flush time | | 0.00588333 | min |
| All | Min flush time per shard | | 0 | min |
| All | Median flush time per shard | | 0 | min |
| All | Max flush time per shard | | 0.00305 | min |
| All | Total Young Gen GC | | 3424.56 | s |
| All | Total Old Gen GC | | 0.428 | s |
| All | Store size | | 25.1363 | GB |
| All | Translog size | | 4.06481 | GB |
| All | Heap used for segments | | 40.8529 | MB |
| All | Heap used for doc values | | 3.37786 | MB |
| All | Heap used for terms | | 23.4827 | MB |
| All | Heap used for norms | | 0.000305176 | MB |
| All | Heap used for points | | 9.73372 | MB |
| All | Heap used for stored fields | | 4.25829 | MB |
| All | Segment count | | 295 | |
| All | Min Throughput | index-update | 6633.7 | docs/s |
| All | Median Throughput | index-update | 7144.32 | docs/s |
| All | Max Throughput | index-update | 7473.19 | docs/s |
| All | 50th percentile latency | index-update | 5206.57 | ms |
| All | 90th percentile latency | index-update | 7434.04 | ms |
| All | 99th percentile latency | index-update | 15225.5 | ms |
| All | 99.9th percentile latency | index-update | 30710.5 | ms |
| All | 99.99th percentile latency | index-update | 41678.9 | ms |
| All | 100th percentile latency | index-update | 41977.5 | ms |
| All | 50th percentile service time | index-update | 5206.57 | ms |
| All | 90th percentile service time | index-update | 7434.04 | ms |
| All | 99th percentile service time | index-update | 15225.5 | ms |
| All | 99.9th percentile service time | index-update | 30710.5 | ms |
| All | 99.99th percentile service time | index-update | 41678.9 | ms |
| All | 100th percentile service time | index-update | 41977.5 | ms |
| All | error rate | index-update | 0 | % |

What is the full output of the cluster stats API?

{
  "_nodes" : {
    "total" : 3,
    "successful" : 3,
    "failed" : 0
  },
  "cluster_name" : "DDL",
  "cluster_uuid" : "Z6I3dHv0TPSkOBtcdHFQHw",
  "timestamp" : 1548062105067,
  "status" : "green",
  "indices" : {
    "count" : 219,
    "shards" : {
      "total" : 242,
      "primaries" : 226,
      "replication" : 0.07079646017699115,
      "index" : {
        "shards" : {
          "min" : 1,
          "max" : 6,
          "avg" : 1.1050228310502284
        },
        "primaries" : {
          "min" : 1,
          "max" : 6,
          "avg" : 1.0319634703196348
        },
        "replication" : {
          "min" : 0.0,
          "max" : 1.0,
          "avg" : 0.0730593607305936
        }
      }
    },
    "docs" : {
      "count" : 61318280,
      "deleted" : 7633014
    },
    "store" : {
      "size" : "15.2gb",
      "size_in_bytes" : 16333773175
    },
    "fielddata" : {
      "memory_size" : "121.9kb",
      "memory_size_in_bytes" : 124832,
      "evictions" : 0
    },
    "query_cache" : {
      "memory_size" : "27.8mb",
      "memory_size_in_bytes" : 29243840,
      "total_count" : 1041873,
      "hit_count" : 225631,
      "miss_count" : 816242,
      "cache_size" : 10894,
      "cache_count" : 44847,
      "evictions" : 33953
    },
    "completion" : {
      "size" : "0b",
      "size_in_bytes" : 0
    },
    "segments" : {
      "count" : 1623,
      "memory" : "48.5mb",
      "memory_in_bytes" : 50896554,
      "terms_memory" : "31.8mb",
      "terms_memory_in_bytes" : 33370510,
      "stored_fields_memory" : "4.3mb",
      "stored_fields_memory_in_bytes" : 4553352,
      "term_vectors_memory" : "0b",
      "term_vectors_memory_in_bytes" : 0,
      "norms_memory" : "1mb",
      "norms_memory_in_bytes" : 1086336,
      "points_memory" : "5.2mb",
      "points_memory_in_bytes" : 5463968,
      "doc_values_memory" : "6.1mb",
      "doc_values_memory_in_bytes" : 6422388,
      "index_writer_memory" : "0b",
      "index_writer_memory_in_bytes" : 0,
      "version_map_memory" : "0b",
      "version_map_memory_in_bytes" : 0,
      "fixed_bit_set" : "2.3mb",
      "fixed_bit_set_memory_in_bytes" : 2455960,
      "max_unsafe_auto_id_timestamp" : 1548028810206,
      "file_sizes" : { }
    }
  },
  "nodes" : {
    "count" : {
      "total" : 3,
      "data" : 3,
      "coordinating_only" : 0,
      "master" : 3,
      "ingest" : 3
    },
    "versions" : [
      "6.5.4"
    ],
    "os" : {
      "available_processors" : 3,
      "allocated_processors" : 3,
      "names" : [
        {
          "name" : "Linux",
          "count" : 3
        }
      ],
      "mem" : {
        "total" : "188.7gb",
        "total_in_bytes" : 202662752256,
        "free" : "54.2gb",
        "free_in_bytes" : 58225688576,
        "used" : "134.5gb",
        "used_in_bytes" : 144437063680,
        "free_percent" : 29,
        "used_percent" : 71
      }
    },
    "process" : {
      "cpu" : {
        "percent" : 1
      },
      "open_file_descriptors" : {
        "min" : 596,
        "max" : 663,
        "avg" : 633
      }
    },
    "jvm" : {
      "max_uptime" : "1.7d",
      "max_uptime_in_millis" : 151748905,
      "versions" : [
        {
          "version" : "11.0.1",
          "vm_name" : "OpenJDK 64-Bit Server VM",
          "vm_version" : "11.0.1+13",
          "vm_vendor" : "Oracle Corporation",
          "count" : 3
        }
      ],
      "mem" : {
        "heap_used" : "36.9gb",
        "heap_used_in_bytes" : 39669175176,
        "heap_max" : "96gb",
        "heap_max_in_bytes" : 103079215104
      },
      "threads" : 137
    },
    "fs" : {
      "total" : "2.9tb",
      "total_in_bytes" : 3243205423104,
      "free" : "2.9tb",
      "free_in_bytes" : 3223063965696,
      "available" : "2.9tb",
      "available_in_bytes" : 3223013634048
    },
    "plugins" : [
      {
        "name" : "ingest-user-agent",
        "version" : "6.5.4",
        "elasticsearch_version" : "6.5.4",
        "java_version" : "1.8",
        "description" : "Ingest processor that extracts information from a user agent",
        "classname" : "org.elasticsearch.ingest.useragent.IngestUserAgentPlugin",
        "extended_plugins" : [ ],
        "has_native_controller" : false
      },
      {
        "name" : "ingest-geoip",
        "version" : "6.5.4",
        "elasticsearch_version" : "6.5.4",
        "java_version" : "1.8",
        "description" : "Ingest processor that uses looksup geo data based on ip adresses using the Maxmind geo database",
        "classname" : "org.elasticsearch.ingest.geoip.IngestGeoIpPlugin",
        "extended_plugins" : [ ],
        "has_native_controller" : false
      }
    ],
    "network_types" : {
      "transport_types" : {
        "security4" : 3
      },
      "http_types" : {
        "security4" : 3
      }
    }
  }
}

I added resource requests in the YAML file.

      resources:
        requests:
          memory: 40Gi
          cpu: "7"

Without the resource requests, Elasticsearch was using only one CPU.

Now I see that Elasticsearch is using 21 CPUs across the three nodes:
"nodes" : {
"count" : {
"total" : 3,
"data" : 3,
"coordinating_only" : 0,
"master" : 3,
"ingest" : 3
},
"versions" : [
"6.5.4"
],
"os" : {
"available_processors" : 21,
"allocated_processors" : 21,
"names" : [
{
"name" : "Linux",
"count" : 3
}
],

Is it normal behavior that, without resources -> requests -> cpu set, Elasticsearch will use only one CPU?
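
In case it helps others: instead of relying on the cgroup detection, the CPU count the JVM reports can also be pinned with -XX:ActiveProcessorCount (a sketch combining that flag with the JVM options from my StatefulSet below; the value 7 just mirrors my CPU request):

            - name: ES_JAVA_OPTS
              value: "-Xms32766m -Xmx32766m -XX:ActiveProcessorCount=7 -XX:-UseConcMarkSweepGC -XX:+UseG1GC -XX:MaxGCPauseMillis=300 -XX:G1HeapRegionSize=16m"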

Which YAML file? I do not recognise these settings. Can you share the whole YAML file?

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch-node
spec:
  selector:
    matchLabels:
      app: elasticsearch # has to match .spec.template.metadata.labels
  serviceName: "elasticsearch"
  replicas: 3
  template:
    metadata:
      labels:
        app: elasticsearch
        elastic-index-name: elasticsearch
    spec:
      serviceAccountName: elasticsearch-logging
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 1
              podAffinityTerm:
                topologyKey: "kubernetes.io/hostname"
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - elasticsearch
      initContainers:
        - name: chmod-er
          image: acramslhgamsfnds.azurecr.io/busybox:1.27.2
          command: ["sh", "-c", "/bin/chown 1000:1000 /data"]
          volumeMounts:
            - name: data
              mountPath: /data
        - name: init-sysctl
          image: acramslhgamsfnds.azurecr.io/busybox:1.27.2
          command: ["sh", "-c", "sysctl -w vm.max_map_count=262144"]
          securityContext:
            privileged: true
      containers:
        - name: elasticsearch
          image: elasticsearch:6.5.4
          ports:
            - containerPort: 9200
            - containerPort: 9300
          resources:
            requests:
              memory: 40Gi
              cpu: "6"
          env:
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: discovery.zen.ping.unicast.hosts
              value: "elasticsearch"
            - name: cluster.name
              value: "DDL"
            - name: discovery.zen.minimum_master_nodes
              value: "2"
            - name: network.host
              value: "0.0.0.0"
            - name: ES_JAVA_OPTS
              value: "-Xms32766m -Xmx32766m -XX:-UseConcMarkSweepGC -XX:+UseG1GC -XX:MaxGCPauseMillis=300 -XX:G1HeapRegionSize=16m"
            - name: path.data
              value: "/data"
            - name: node.master
              value: "true"
            - name: node.data
              value: "true"
            - name: node.ingest
              value: "true"
            - name: node.name
              value: ${HOSTNAME}
            - name: gateway.recover_after_nodes
              value: "3"
          volumeMounts:
            - name: data
              mountPath: /data
      terminationGracePeriodSeconds: 300
      nodeSelector:
        nodefor: elasticsearch

It's kinda hard to read, since YAML cares about how things are indented and you didn't use the </> button, so this vital detail is lost. However, by the looks of it this configures the containers and determines things like the number of CPUs to which they have access, and if you don't ask for more, then 1 seems like a reasonable default.
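
For what it's worth: on JDK 10+ the JVM's container support derives Runtime.availableProcessors() from the cgroup CPU shares/quota that Kubernetes sets, and a pod with no CPU request gets almost no shares, so it looks like a one-CPU machine to Elasticsearch. A sketch of declaring both a request and a limit (the values here are illustrative, not a recommendation):

          resources:
            requests:
              memory: 40Gi
              cpu: "7"
            limits:
              memory: 48Gi
              cpu: "7"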

So without defined resources -> requests -> cpu, Elasticsearch will use only one CPU, even if there are 8 available?
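
For the record, I check what each node detects with the nodes info API (again using my service name):

curl -s 'http://elasticsearch:9200/_nodes/os?filter_path=nodes.*.os.available_processors'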

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.