I'm running into an Elasticsearch-on-Kubernetes performance issue that confuses me, and this is my test environment. First, I have a k8s cluster with 4 nodes: 1 master and 3 worker nodes:
NAME         STATUS   ROLES               AGE    VERSION
172.24.5.3   Ready    master,monitoring   234d   v1.13.5
172.24.5.4   Ready    monitoring,node     234d   v1.13.5
172.24.5.5   Ready    node                234d   v1.13.5
172.24.5.7   Ready    node                234d   v1.13.5
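This node list is simply the output of:
kubectl get nodes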
As you can see, my k8s version is v1.13.5. I then use ECK (https://github.com/elastic/cloud-on-k8s) by running kubectl apply -f https://download.elastic.co/downloads/eck/1.0.0-beta1/all-in-one.yaml to get an es cluster with 5 nodes:
NAME                               READY   STATUS    RESTARTS   AGE   IP              NODE         NOMINATED NODE   READINESS GATES
elasticsearch-sample-es-client-0   1/1     Running   0          18m   10.16.33.84     172.24.5.7   <none>           <none>
elasticsearch-sample-es-data-0     1/1     Running   0          18m   10.16.33.79     172.24.5.7   <none>           <none>
elasticsearch-sample-es-data-1     1/1     Running   0          18m   10.16.215.184   172.24.5.5   <none>           <none>
elasticsearch-sample-es-data-2     1/1     Running   0          18m   10.16.184.199   172.24.5.4   <none>           <none>
elasticsearch-sample-es-master-0   1/1     Running   0          18m   10.16.215.181   172.24.5.5   <none>           <none>
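This pod list comes from the commands below; the 1.0.0-beta1 all-in-one manifest installs the operator into the elastic-system namespace:
# confirm the ECK operator itself is running
kubectl -n elastic-system get pods
# list the es pods together with the k8s node each one was scheduled on
kubectl get pods -n nes-elasticsearch -o wide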
To keep the test consistent, I ensure that the three data nodes are distributed across the three k8s worker nodes. The CR file of this es cluster is as follows:
apiVersion: elasticsearch.k8s.elastic.co/v1beta1
kind: Elasticsearch
metadata:
  name: elasticsearch-sample
  namespace: nes-elasticsearch
spec:
  version: 6.8.4
  http:
    tls:
      selfSignedCertificate:
        disabled: true
  nodeSets:
  - name: master
    config:
      node.master: true
      node.data: false
      node.ingest: false
    podTemplate:
      spec:
        initContainers:
        - name: sysctl
          securityContext:
            privileged: true
          command: ['sh', '-c', 'sysctl -w vm.max_map_count=262144']
        containers:
        - name: elasticsearch
          resources:
            requests:
              memory: 16Gi
              cpu: 8
            limits:
              memory: 16Gi
              cpu: 8
          env:
          - name: ES_JAVA_OPTS
            value: "-Xms4g -Xmx4g"
    count: 1
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 32Gi
  - name: client
    config:
      node.master: false
      node.data: false
      node.ingest: false
    podTemplate:
      spec:
        initContainers:
        - name: sysctl
          securityContext:
            privileged: true
          command: ['sh', '-c', 'sysctl -w vm.max_map_count=262144']
        containers:
        - name: elasticsearch
          resources:
            requests:
              memory: 16Gi
              cpu: 8
            limits:
              memory: 16Gi
              cpu: 8
          env:
          - name: ES_JAVA_OPTS
            value: "-Xms4g -Xmx4g"
    count: 1
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 32Gi
  - name: data
    config:
      node.master: false
      node.data: true
      node.ingest: false
    podTemplate:
      spec:
        initContainers:
        - name: sysctl
          securityContext:
            privileged: true
          command: ['sh', '-c', 'sysctl -w vm.max_map_count=262144']
        containers:
        - name: elasticsearch
          resources:
            requests:
              memory: 16Gi
              cpu: 8
            limits:
              memory: 16Gi
              cpu: 8
          env:
          - name: ES_JAVA_OPTS
            value: "-Xms4g -Xmx4g"
    count: 3
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 32Gi
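Assuming the CR above is saved as es-cluster.yaml (the filename is my own choice), the cluster is created and its health can be watched with:
kubectl apply -f es-cluster.yaml
# the Elasticsearch resource reports health and phase once the operator reconciles it
kubectl get elasticsearch -n nes-elasticsearch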
You can see that in the above CR, the memory request and the memory limit are set to the same value, 16Gi. This is a very important parameter: the whole test revolves around the memory request and memory limit. The PV file is as follows:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: es-local-pv3
spec:
  capacity:
    storage: 32Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  local:
    path: /home/data/eck-test1
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - 172.24.5.4
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: es-local-pv1
spec:
  capacity:
    storage: 32Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  local:
    path: /home/data/eck-test1
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - 172.24.5.5
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: es-local-pv2
spec:
  capacity:
    storage: 32Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  local:
    path: /home/data/eck-test1
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - 172.24.5.7
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: es-local-pv7
spec:
  capacity:
    storage: 32Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  local:
    path: /home/data/eck-test2
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - 172.24.5.5
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: es-local-pv6
spec:
  capacity:
    storage: 32Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  local:
    path: /home/data/eck-test2
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - 172.24.5.7
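Assuming the PVs above are saved as es-pv.yaml, and that /home/data/eck-test1 and /home/data/eck-test2 already exist on the respective nodes, they are created and checked with:
kubectl apply -f es-pv.yaml
# all five PVs should end up Bound to the five elasticsearch-data PVCs
kubectl get pv
kubectl get pvc -n nes-elasticsearch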
With the above files, you can quickly create an es cluster like mine. Then I use Rally (https://github.com/elastic/rally) to test this cluster, using the http_logs track (https://github.com/elastic/rally-tracks/tree/master/http_logs) and testing only the index-append operation:
{
  "name": "index-append",
  "operation-type": "bulk",
  "bulk-size": {{bulk_size | default(5000)}},
  "ingest-percentage": {{ingest_percentage | default(100)}},
  "corpora": "http_logs"
}
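The bulk_size and ingest_percentage template variables can be overridden per run without editing the track, via Rally's --track-params flag, e.g.:
esrally ... --track-params="bulk_size:5000,ingest_percentage:100"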
The challenge's schedule is:
"schedule": [
        {
          "operation": "delete-index"
        },
        {
          "operation": {
            "operation-type": "create-index",
            "settings": {{index_settings | default({}) | tojson}}
          }
        },
        {
          "name": "check-cluster-health",
          "operation": {
            "operation-type": "cluster-health",
            "index": "logs-*",
            "request-params": {
              "wait_for_status": "{{cluster_health | default('green')}}",
              "wait_for_no_relocating_shards": "true"
            }
          }
        },
        {
          "operation": "index-append",
          "warmup-time-period": 240,
          "clients": {{bulk_indexing_clients | default(30)}}
        }
      ]
As you can see, I simplified the test, only testing indexing performance. I also placed esrally inside a Docker container on 172.24.5.3.
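A minimal sketch of how such a container can be started (the elastic/rally image and the bind-mount path are assumptions, not my exact setup):
# keep Rally's config and downloaded track data on the host between runs
docker run -it --rm -v "$PWD/rally:/rally/.rally" --entrypoint bash elastic/rally
First, get the password of the es cluster; the command below stores it in $PASSWORD so the Rally invocation can use it: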
PASSWORD=$(kubectl get secret elasticsearch-sample-es-elastic-user -n nes-elasticsearch -o=jsonpath='{.data.elastic}' | base64 --decode)
then run Rally:
esrally --pipeline=benchmark-only --target-hosts=192.168.12.3:9200 --track=/rally/.rally/benchmarks/tracks/http_logs --report-format=csv --report-file=result.csv --challenge=append-no-conflicts --client-options="use_ssl:false,verify_certs:false,basic_auth_user:'elastic',basic_auth_password:'$PASSWORD'"
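The target host 192.168.12.3 is the ClusterIP of the HTTP service that ECK creates; with the default <cluster-name>-es-http naming it can be looked up with:
kubectl get svc elasticsearch-sample-es-http -n nes-elasticsearch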
I got the test result as follows:
Metric,Task,Value,Unit
Cumulative indexing time of primary shards,,303.7706,min
Min cumulative indexing time across primary shards,,0.9421166666666667,min
Median cumulative indexing time across primary shards,,3.1037833333333333,min
Max cumulative indexing time across primary shards,,69.81788333333334,min
Cumulative indexing throttle time of primary shards,,0,min
Min cumulative indexing throttle time across primary shards,,0,min
Median cumulative indexing throttle time across primary shards,,0,min
Max cumulative indexing throttle time across primary shards,,0,min
Cumulative merge time of primary shards,,139.40831666666665,min
Cumulative merge count of primary shards,,3138,
Min cumulative merge time across primary shards,,0.09126666666666666,min
Median cumulative merge time across primary shards,,0.5575166666666667,min
Max cumulative merge time across primary shards,,26.99235,min
Cumulative merge throttle time of primary shards,,64.86913333333334,min
Min cumulative merge throttle time across primary shards,,0,min
Median cumulative merge throttle time across primary shards,,0.0576,min
Max cumulative merge throttle time across primary shards,,14.664250000000001,min
Cumulative refresh time of primary shards,,15.429016666666666,min
Cumulative refresh count of primary shards,,6023,
Min cumulative refresh time across primary shards,,0.06673333333333333,min
Median cumulative refresh time across primary shards,,0.14036666666666667,min
Max cumulative refresh time across primary shards,,2.721033333333333,min
Cumulative flush time of primary shards,,0.79705,min
Cumulative flush count of primary shards,,115,
Min cumulative flush time across primary shards,,0.00016666666666666666,min
Median cumulative flush time across primary shards,,0.00036666666666666667,min
Max cumulative flush time across primary shards,,0.24375,min
Total Young Gen GC,,236.31,s
Total Old Gen GC,,2.958,s
Store size,,22.122190072201192,GB
Translog size,,14.817421832121909,GB
Heap used for segments,,94.57985973358154,MB
Heap used for doc values,,0.1043548583984375,MB
Heap used for terms,,81.22084045410156,MB
Heap used for norms,,0.036376953125,MB
Heap used for points,,5.796648979187012,MB
Heap used for stored fields,,7.421638488769531,MB
Segment count,,596,
Min Throughput,index-append,170934.94,docs/s
Median Throughput,index-append,175795.68,docs/s
Max Throughput,index-append,182926.54,docs/s
50th percentile latency,index-append,852.5300654582679,ms
90th percentile latency,index-append,1073.6245419830084,ms
99th percentile latency,index-append,1436.844245232641,ms
99.9th percentile latency,index-append,3084.4296338940176,ms
99.99th percentile latency,index-append,3681.6509089201218,ms
100th percentile latency,index-append,4000.8082520216703,ms
50th percentile service time,index-append,852.5300654582679,ms
90th percentile service time,index-append,1073.6245419830084,ms
99th percentile service time,index-append,1436.844245232641,ms
99.9th percentile service time,index-append,3084.4296338940176,ms
99.99th percentile service time,index-append,3681.6509089201218,ms
100th percentile service time,index-append,4000.8082520216703,ms
error rate,index-append,0.00,%