High CPU Usage in Elastic

Hi, I have 2 nodes in an Elasticsearch cluster: 1 master and 1 data node. The master node is also a data node.

I have 2 Logstash instances and one Filebeat instance running on separate machines.

The load is modest, only 100-200 requests per second.

However, I am facing some performance issues. CPU usage on Elasticsearch is too high, at 150-250%. When I request the hot threads, I get the following output:

::: {node124}{lo4V14MESv2jDG5oK3bzwQ}{v4osXlRkQC2WDONT1fpNog}{-.-.-.-}{-.-.-.-:9300}{cdfhilrstw}{ml.machine_memory=66995937280, xpack.installed=true, transform.node=true, ml.max_open_jobs=512, ml.max_jvm_size=1073741824}
   Hot threads at 2022-06-29T07:40:52.008Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:

   100.0% [cpu=0.6%, other=99.4%] (500ms out of 500ms) cpu usage by thread 'elasticsearch[node124][transport_worker][T#17]'
     3/10 snapshots sharing following 3 elements
       io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
       io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
       java.base@18/java.lang.Thread.run(Thread.java:833)

::: {node201}{I5WCg6avQV6S_ebq-z3DCA}{4nMCyTuPSdWDUisMvIUpkQ}{0.0.0.0}{0.0.0.0:9300}{cdfhilmrstw}{ml.machine_memory=67350491136, ml.max_open_jobs=512, xpack.installed=true, ml.max_jvm_size=1073741824, transform.node=true}
   Hot threads at 2022-06-29T07:29:45.081Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:

   100.0% [cpu=1.6%, other=98.4%] (500ms out of 500ms) cpu usage by thread 'elasticsearch[node201][transport_worker][T#3]'
     2/10 snapshots sharing following 3 elements
       io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
       io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
       java.base@18/java.lang.Thread.run(Thread.java:833)
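
For comparing the two nodes, here is a minimal Python sketch that pulls the per-thread figures out of hot-threads text like the output above. The regular expression is written only against the line format pasted here, and the names `THREAD_LINE`, `parse_hot_threads` and `SAMPLE` are made up for illustration. In both samples the `cpu=` figure is below 2% and the rest is reported as `other`, which, as far as I understand (an assumption, not confirmed in this thread), is time the thread was not actually running on a CPU.

```python
import re

# Matches a thread summary line in the format pasted above, e.g.
# "100.0% [cpu=0.6%, other=99.4%] (500ms out of 500ms) cpu usage by thread '...'"
THREAD_LINE = re.compile(
    r"(?P<total>[\d.]+)% \[cpu=(?P<cpu>[\d.]+)%, other=(?P<other>[\d.]+)%\] "
    r"\((?P<busy>\S+) out of (?P<interval>\S+)\) "
    r"cpu usage by thread '(?P<thread>[^']+)'"
)


def parse_hot_threads(text):
    """Return one dict per thread summary line found in hot-threads text."""
    rows = []
    for line in text.splitlines():
        match = THREAD_LINE.search(line)
        if match:
            rows.append({
                "thread": match.group("thread"),
                "total_pct": float(match.group("total")),
                "cpu_pct": float(match.group("cpu")),
                "other_pct": float(match.group("other")),
            })
    return rows


# The two summary lines from the output above.
SAMPLE = """\
   100.0% [cpu=0.6%, other=99.4%] (500ms out of 500ms) cpu usage by thread 'elasticsearch[node124][transport_worker][T#17]'
   100.0% [cpu=1.6%, other=98.4%] (500ms out of 500ms) cpu usage by thread 'elasticsearch[node201][transport_worker][T#3]'
"""

if __name__ == "__main__":
    for row in parse_hot_threads(SAMPLE):
        print(f"{row['thread']}: cpu={row['cpu_pct']}% other={row['other_pct']}%")
```

If the `cpu=` share stays this low across repeated samples, these particular threads are probably not what drives the 150-250% seen at the OS level, so it may be worth sampling several times and looking at other thread pools as well.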

What hardware are you running Elasticsearch on? What is the configuration of the cluster?


My system hardware info is as follows:

H/W path         Device      Class          Description
=======================================================
                             system         SYS-1029U-TR4T (To be filled by O.E.M.)
/0                           bus            X11DPU
/0/0                         memory         64KiB BIOS
/0/1                         memory         64GiB System Memory
/0/1/0                       memory         8GiB DIMM DDR4 Synchronous 2666 MHz (0.4 ns)
/0/1/1                       memory         DIMM [empty]
/0/1/2                       memory         16GiB DIMM DDR4 Synchronous 2666 MHz (0.4 ns)
/0/1/3                       memory         DIMM [empty]
/0/1/4                       memory         DIMM [empty]
/0/1/5                       memory         DIMM [empty]
/0/1/6                       memory         8GiB DIMM DDR4 Synchronous 2666 MHz (0.4 ns)
/0/1/7                       memory         DIMM [empty]
/0/1/8                       memory         DIMM [empty]
/0/1/9                       memory         DIMM [empty]
/0/1/a                       memory         DIMM [empty]
/0/1/b                       memory         DIMM [empty]
/0/1/c                       memory         8GiB DIMM DDR4 Synchronous 2666 MHz (0.4 ns)
/0/1/d                       memory         DIMM [empty]
/0/1/e                       memory         16GiB DIMM DDR4 Synchronous 2666 MHz (0.4 ns)
/0/1/f                       memory         DIMM [empty]
/0/1/10                      memory         DIMM [empty]
/0/1/11                      memory         DIMM [empty]
/0/1/12                      memory         8GiB DIMM DDR4 Synchronous 2666 MHz (0.4 ns)
/0/1/13                      memory         DIMM [empty]
/0/1/14                      memory         DIMM [empty]
/0/1/15                      memory         DIMM [empty]
/0/1/16                      memory         DIMM [empty]
/0/1/17                      memory         DIMM [empty]
/0/3c                        memory         512KiB L1 cache
/0/3d                        memory         8MiB L2 cache
/0/3e                        memory         11MiB L3 cache
/0/3f                        processor      Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz
/0/40                        memory         512KiB L1 cache
/0/41                        memory         8MiB L2 cache
/0/42                        memory         11MiB L3 cache
/0/43                        processor      Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz
/0/100                       bridge         Sky Lake-E DMI3 Registers
/0/100/4         /dev/fb0    generic        Sky Lake-E CBDMA Registers
/0/100/4.1                   generic        Sky Lake-E CBDMA Registers
/0/100/4.2                   generic        Sky Lake-E CBDMA Registers
/0/100/4.3                   generic        Sky Lake-E CBDMA Registers
/0/100/4.4                   generic        Sky Lake-E CBDMA Registers
/0/100/4.5                   generic        Sky Lake-E CBDMA Registers
/0/100/4.6                   generic        Sky Lake-E CBDMA Registers
/0/100/4.7                   generic        Sky Lake-E CBDMA Registers
/0/100/5                     generic        Sky Lake-E MM/Vt-d Configuration Registers
/0/100/5.2                   generic        Sky Lake-E RAS
/0/100/5.4                   generic        Sky Lake-E IOAPIC
/0/100/8                     generic        Sky Lake-E Ubox Registers
/0/100/8.1                   generic        Sky Lake-E Ubox Registers
/0/100/8.2                   generic        Sky Lake-E Ubox Registers
/0/100/11                    generic        C620 Series Chipset Family MROM 0
/0/100/11.1                  generic        C620 Series Chipset Family MROM 1
/0/100/11.5      scsi0       storage        C620 Series Chipset Family SSATA Controller [AHCI mode]
/0/100/11.5/0    /dev/sda    disk           480GB INTEL SSDSC2KB48
/0/100/11.5/0/1  /dev/sda1   volume         199MiB Windows FAT volume
/0/100/11.5/0/2  /dev/sda2   volume         1023MiB data partition
/0/100/11.5/0/3  /dev/sda3   volume         445GiB LVM Physical Volume
/0/100/11.5/1    /dev/sdb    disk           480GB INTEL SSDSC2KB48
/0/100/11.5/1/1  /dev/sdb1   volume         511KiB boot partition
/0/100/11.5/1/2  /dev/sdb2   volume         447GiB ZFS partition
/0/100/11.5/2    /dev/sdc    disk           480GB INTEL SSDSC2KB48
/0/100/11.5/2/1  /dev/sdc1   volume         199MiB Windows FAT volume
/0/100/11.5/2/2  /dev/sdc2   volume         1023MiB data partition
/0/100/11.5/2/3  /dev/sdc3   volume         130GiB data partition
/0/100/11.5/2/4  /dev/sdc4   volume         15GiB Linux swap volume
/0/100/11.5/2/5  /dev/sdc5   volume         99GiB data partition
/0/100/11.5/2/6  /dev/sdc6   volume         199GiB EFI partition
/0/100/11.5/3    /dev/sdd    disk           480GB INTEL SSDSC2KB48
/0/100/11.5/3/1  /dev/sdd1   volume         1596MiB Linux filesystem partition
/0/100/11.5/3/2  /dev/sdd2   volume         445GiB Linux LVM Physical Volume partition
/0/100/14                    bus            C620 Series Chipset Family USB 3.0 xHCI Controller
/0/100/14/0      usb1        bus            xHCI Host Controller
/0/100/14/0/7                bus            Hub
/0/100/14/0/7/1              input          Keyboard
/0/100/14/1      usb2        bus            xHCI Host Controller
/0/100/14.2                  generic        C620 Series Chipset Family Thermal Subsystem
/0/100/16                    communication  C620 Series Chipset Family MEI Controller #1
/0/100/16.1                  communication  C620 Series Chipset Family MEI Controller #2
/0/100/16.4                  communication  C620 Series Chipset Family MEI Controller #3
/0/100/17        scsi6       storage        C620 Series Chipset Family SATA Controller [AHCI mode]
/0/100/17/0      /dev/sde    disk           480GB INTEL SSDSC2KB48
/0/100/17/1      /dev/sdf    disk           480GB INTEL SSDSC2KB48
/0/100/17/2      /dev/sdg    disk           480GB INTEL SSDSC2KB48
/0/100/17/3      /dev/sdh    disk           480GB INTEL SSDSC2KB48
/0/100/1c                    bridge         C620 Series Chipset Family PCI Express Root Port #1
/0/100/1c.5                  bridge         C620 Series Chipset Family PCI Express Root Port #6
/0/100/1c.5/0                bridge         AST1150 PCI-to-PCI Bridge
/0/100/1c.5/0/0              display        ASPEED Graphics Family
/0/100/1f                    bridge         C621 Series Chipset LPC/eSPI Controller
/0/100/1f.2                  memory         Memory controller
/0/100/1f.4                  bus            C620 Series Chipset Family SMBus
/0/100/1f.5                  bus            C620 Series Chipset Family SPI Controller
/0/101                       bridge         Sky Lake-E PCI Express Root Port A
/0/101/0         eno1        network        Ethernet Controller X710/X557-AT 10GBASE-T
/0/101/0.1       eno2        network        Ethernet Controller X710/X557-AT 10GBASE-T
/0/101/0.2       eno3        network        Ethernet Controller X710/X557-AT 10GBASE-T
/0/101/0.3       eno4        network        Ethernet Controller X710/X557-AT 10GBASE-T
/0/102                       bridge         Sky Lake-E PCI Express Root Port C

@Christian_Dahlqvist My cluster settings are as follows:

{
  "persistent" : {
    "cluster" : {
      "routing" : {
        "rebalance" : {
          "enable" : "primaries"
        },
        "allocation" : {
          "allow_rebalance" : "always",
          "cluster_concurrent_rebalance" : "2"
        }
      }
    },
    "xpack" : {
      "monitoring" : {
        "collection" : {
          "enabled" : "true"
        }
      }
    }
  },
  "transient" : { }
}

The cluster health API returns the following:

{
  "cluster_name" : "oss_cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 23,
  "active_shards" : 44,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
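
As a quick sanity check on those numbers, the shard counts can be broken down with a few lines of Python. This is only a sketch over the fields shown in the health response above. The helper name `summarize_health` is hypothetical, and the per-node figure is a plain average, not the real allocation.

```python
import json

# Relevant fields copied from the cluster health response above.
HEALTH = json.loads("""
{
  "status": "green",
  "number_of_data_nodes": 2,
  "active_primary_shards": 23,
  "active_shards": 44,
  "unassigned_shards": 0
}
""")


def summarize_health(health):
    """Derive replica count and average shards per data node."""
    primaries = health["active_primary_shards"]
    total = health["active_shards"]
    return {
        "primaries": primaries,
        # Active shards that are not primaries are replica copies.
        "replicas": total - primaries,
        # Average only; the real distribution may be uneven.
        "avg_shards_per_data_node": total / health["number_of_data_nodes"],
    }


if __name__ == "__main__":
    print(summarize_health(HEALTH))
```

With the values above this gives 23 primaries, 21 replica copies, and an average of 22 shards per data node. That is a small shard count, so the number of shards alone seems unlikely to explain the CPU usage.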


Also, can you tell me what the transport_worker threads do?
