Elasticsearch multithreading problems and how to improve Elasticsearch performance

Hello everyone. I have 15.1 million documents and a problem with Elasticsearch multithreading: I'm running 46 aggregations (count, cardinality, date_histogram) together and it takes about 20-45 seconds. That's far too long for us; my expectation is about 2-5 seconds. Any help is much appreciated.

Cluster: 4-core, 16 GB RAM server.

I use the low-level Python client library, elasticsearch-py.

Query and code execution times:

full ES query (link)

Part of the code (view the full code via the link):

import time

# client is an elasticsearch-py Elasticsearch instance; main_query holds all 46 aggregations
start = time.time()
client.search(index=['nginx*'], doc_type=None, body=main_query)
print('Standard Search Query Execution Time: ', time.time() - start)

Output: Standard Search Query Execution Time:  19.81995415687561s

start = time.time()
client.msearch(body=msearch_query)
print('Standard Multi Search Query Execution Time: ', time.time() - start)

Output: Standard Multi Search Query Execution Time:  19.14345407485962s
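For reference, the msearch body is a flat sequence of header/body pairs. Here is a minimal sketch of how msearch_query can be built with elasticsearch-py; the index pattern is from above, but the three aggregations are hypothetical stand-ins for the real 46:

from elasticsearch import Elasticsearch

client = Elasticsearch(['localhost:9200'])  # placeholder host

# msearch takes alternating header and body dicts; the client
# serializes the list into the NDJSON wire format.
msearch_query = []
for agg_name, agg in [
    ('status_counts', {'terms': {'field': 'status'}}),
    ('unique_ips', {'cardinality': {'field': 'remote_addr'}}),
    ('requests_per_hour', {'date_histogram': {'field': '@timestamp', 'interval': '1h'}}),
]:
    msearch_query.append({'index': 'nginx*'})                   # header: target index
    msearch_query.append({'size': 0, 'aggs': {agg_name: agg}})  # body: aggregation only

responses = client.msearch(body=msearch_query)['responses']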

from concurrent.futures import ThreadPoolExecutor

start = time.time()
with ThreadPoolExecutor(50) as ex:
    # consume the iterator so all requests finish and exceptions surface;
    # iterable is assumed to yield (index, doc_type, body) tuples
    list(ex.map(lambda q: client.search(*q), iterable))
print('Search Query in Multi Threading Execution Time: ', time.time() - start)

Output: Search Query in Multi Threading Execution Time:  20.80817937850952s

from threading import Thread

start = time.time()
# one thread per query; each arg is assumed to be an (index, doc_type, body) tuple
jobs = [Thread(target=client.search, args=arg) for arg in iterable]
# start threads
for job in jobs:
    job.start()
# wait for all threads to finish
for job in jobs:
    job.join()
print('Search Query in Standard Multi Threading Execution Time: ', time.time() - start)

Output: Search Query in Standard Multi Threading Execution Time:  21.506370544433594s

Elasticsearch configuration (elasticsearch.yml):

path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch

node.master: true
node.data: true

http.host: 0.0.0.0
network.host: 0.0.0.0

script.painless.regex.enabled: true
http.max_initial_line_length: 10K

cloud:
  gce:
    project_id: myproj-es
    zone: europe-west1-b
discovery:
  zen.hosts_provider: gce
  zen.minimum_master_nodes: 2

I tried multithreading to improve search query performance, but it had no effect, and I think I can see why:

When I use multithreading, every new request seems to wait for the previous one to finish. Why is that, and how can I solve it? How can I configure Elasticsearch so that a new request doesn't have to wait for the previous one to finish?
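One client-side cause worth ruling out first: elasticsearch-py's urllib3 transport keeps a pool of persistent connections per node, and its default size is 10, so 50 threads end up queueing behind 10 connections. A minimal sketch of raising the pool size, assuming the default urllib3 transport (the host is a placeholder):

from elasticsearch import Elasticsearch

# maxsize sets the urllib3 connection pool size per node (default 10);
# with 50 worker threads, a pool of 10 makes the extra requests wait.
client = Elasticsearch(['es-node-1:9200'], maxsize=50)

Even with a larger pool, each 4-core data node runs searches on a fixed search threadpool of int((cores * 3) / 2) + 1 = 7 threads, so heavy aggregations can still serialize on the server side.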

Python multithreading examples: the same two snippets as above, with the same timings (~21.5 s and ~20.8 s).

What is the solution? How can I write an optimal query for this situation?

Is it possible to run many aggregations together simultaneously, as in the examples above? (See the sketch below.)
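For what it's worth, a single search request can already carry many sibling aggregations, which Elasticsearch evaluates together over the matching documents. A minimal sketch, with hypothetical field names:

# size=0 skips fetching hits; all three aggregations run in one request
body = {
    'size': 0,
    'aggs': {
        'unique_visitors': {'cardinality': {'field': 'remote_addr'}},
        'hits_per_hour': {'date_histogram': {'field': '@timestamp', 'interval': '1h'}},
        'status_codes': {'terms': {'field': 'status'}},
    },
}
resp = client.search(index=['nginx*'], body=body)
print(list(resp['aggregations']))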

Thanks in advance!

Have you identified what is limiting performance in your Elasticsearch cluster? Is disk I/O saturated so you are seeing a lot of iowait? Is CPU saturated and constantly at 100% when you are running the query?

Hello Christian, thanks for the answer.
I have two clusters, each with 4 cores and 16 GB RAM, and each cluster has two nodes, so four nodes in total.
I checked the CPU load while running the query with multithreading, see the pictures below.
For example:

with ThreadPoolExecutor(50) as ex:
    ex.map(lambda q: client.search(*q), iterable)

(screenshots of CPU load for cluster 1 and cluster 2)

But there is plenty of free RAM, so why is that? Or is my jvm.options configuration wrong?

# Xms represents the initial size of total heap space
# Xmx represents the maximum size of total heap space
-Xms8g
-Xmx8g

Is this right?
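For reference, 8g is half of the 16 GB of RAM, which matches the usual guidance of keeping the heap at or below 50% of physical memory so the rest is left for the filesystem cache. A quick sketch for checking the heap actually in effect via the cat API:

# prints per-node heap usage and limits
print(client.cat.nodes(v=True, h='name,heap.percent,heap.max,ram.max'))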

Please do not paste images of text as it is very hard to read. From what I can see it looks like all 4 CPU cores are saturated, although it is hard to see how much of this is iowait. What does iostat give?

Okay Christian, thanks once again :slight_smile:

cluster 1 stats, from GET /_nodes/stats/_all?pretty:

 "fs" : {
  "timestamp" : 1535446522546,
  "total" : {
    "total_in_bytes" : 103880232960,
    "free_in_bytes" : 88270123008,
    "available_in_bytes" : 88253345792
  },
  "least_usage_estimate" : {
    "path" : "/var/lib/elasticsearch/nodes/0",
    "total_in_bytes" : 103880232960,
    "available_in_bytes" : 88253349888,
    "used_disk_percent" : 15.04317291819828
  },
  "most_usage_estimate" : {
    "path" : "/var/lib/elasticsearch/nodes/0",
    "total_in_bytes" : 103880232960,
    "available_in_bytes" : 88253349888,
    "used_disk_percent" : 15.04317291819828
  },
  "data" : [
    {
      "path" : "/var/lib/elasticsearch/nodes/0",
      "mount" : "/ (/dev/sda1)",
      "type" : "ext4",
      "total_in_bytes" : 103880232960,
      "free_in_bytes" : 88270123008,
      "available_in_bytes" : 88253345792
    }
  ],
  "io_stats" : {
    "devices" : [
      {
        "device_name" : "sda1",
        "operations" : 316674,
        "read_operations" : 155410,
        "write_operations" : 161264,
        "read_kilobytes" : 4882556,
        "write_kilobytes" : 3050428
      }
    ],
    "total" : {
      "operations" : 316674,
      "read_operations" : 155410,
      "write_operations" : 161264,
      "read_kilobytes" : 4882556,
      "write_kilobytes" : 3050428
    }
  }
},


"fs" : {
  "timestamp" : 1535446521380,
  "total" : {
    "total_in_bytes" : 103880232960,
    "free_in_bytes" : 88530046976,
    "available_in_bytes" : 88513269760
  },
  "data" : [
    {
      "path" : "/var/lib/elasticsearch/nodes/0",
      "mount" : "/ (/dev/sda1)",
      "type" : "ext4",
      "total_in_bytes" : 103880232960,
      "free_in_bytes" : 88530046976,
      "available_in_bytes" : 88513269760
    }
  ],
  "io_stats" : {
    "devices" : [
      {
        "device_name" : "sda1",
        "operations" : 212440,
        "read_operations" : 151995,
        "write_operations" : 60445,
        "read_kilobytes" : 4926748,
        "write_kilobytes" : 822724
      }
    ],
    "total" : {
      "operations" : 212440,
      "read_operations" : 151995,
      "write_operations" : 60445,
      "read_kilobytes" : 4926748,
      "write_kilobytes" : 822724
    }
  }
},

cluster 2 stats, from GET /_nodes/stats/_all?pretty:

"fs" : {
  "timestamp" : 1535447090560,
  "total" : {
    "total_in_bytes" : 103880232960,
    "free_in_bytes" : 88270102528,
    "available_in_bytes" : 88253325312
  },
  "least_usage_estimate" : {
    "path" : "/var/lib/elasticsearch/nodes/0",
    "total_in_bytes" : 103880232960,
    "available_in_bytes" : 88253333504,
    "used_disk_percent" : 15.043188690207572
  },
  "most_usage_estimate" : {
    "path" : "/var/lib/elasticsearch/nodes/0",
    "total_in_bytes" : 103880232960,
    "available_in_bytes" : 88253333504,
    "used_disk_percent" : 15.043188690207572
  },
  "data" : [
    {
      "path" : "/var/lib/elasticsearch/nodes/0",
      "mount" : "/ (/dev/sda1)",
      "type" : "ext4",
      "total_in_bytes" : 103880232960,
      "free_in_bytes" : 88270102528,
      "available_in_bytes" : 88253325312
    }
  ],
  "io_stats" : {
    "devices" : [
      {
        "device_name" : "sda1",
        "operations" : 317064,
        "read_operations" : 155411,
        "write_operations" : 161653,
        "read_kilobytes" : 4882560,
        "write_kilobytes" : 3052984
      }
    ],
    "total" : {
      "operations" : 317064,
      "read_operations" : 155411,
      "write_operations" : 161653,
      "read_kilobytes" : 4882560,
      "write_kilobytes" : 3052984
    }
  }
},


"fs" : {
  "timestamp" : 1535447089779,
  "total" : {
    "total_in_bytes" : 103880232960,
    "free_in_bytes" : 88530034688,
    "available_in_bytes" : 88513257472
  },
  "data" : [
    {
      "path" : "/var/lib/elasticsearch/nodes/0",
      "mount" : "/ (/dev/sda1)",
      "type" : "ext4",
      "total_in_bytes" : 103880232960,
      "free_in_bytes" : 88530034688,
      "available_in_bytes" : 88513257472
    }
  ],
  "io_stats" : {
    "devices" : [
      {
        "device_name" : "sda1",
        "operations" : 212693,
        "read_operations" : 152039,
        "write_operations" : 60654,
        "read_kilobytes" : 4927296,
        "write_kilobytes" : 824376
      }
    ],
    "total" : {
      "operations" : 212693,
      "read_operations" : 152039,
      "write_operations" : 60654,
      "read_kilobytes" : 4927296,
      "write_kilobytes" : 824376
    }
  }
},

I am looking for the output of iostat, not Elasticsearch statistics.

Sorry, here are the results now.

I used the commands:
$ iostat                # CPU and device utilization averaged since boot
$ iostat -d 2 10        # device report: 10 samples at 2-second intervals
$ iostat -x sda 2 10    # extended per-device stats (await, %util); the device here is sda
$ iostat -p sda 2 10    # per-partition statistics for sda

Disks look largely idle, so it seems your bottleneck is CPU. I would therefore recommend scaling up or out the cluster.

I'm sorry, the previous results were useless because no query was running.
I have updated the results, please take a look.

That looks like a lot of iowait. What type of storage/disks are you using?

It looks like the disk type is HDD. What is the solution?

# SSD - 0
# HDD - 1
root@es-group-22xc:/home/gogua# cat /sys/block/sda/queue/rotational
1

# where ROTA means rotational device (1 if true, 0 if false)
root@es-group-22xc:/home/gogua# lsblk -d -o name,rota
NAME  ROTA
loop0    1
loop1    1
loop3    1
loop4    1
loop5    1
sda      1

root@es-group-22xc:/home/gogua# lshw -short -C disk
H/W path      Device      Class      Description
================================================
/0/1/0.1.0    /dev/sda    disk       107GB PersistentDisk

root@logmind-es-group-22xc:/home/gogua# cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 01 Lun: 00
  Vendor: Google   Model: PersistentDisk   Rev: 1   
  Type:   Direct-Access                    ANSI  SCSI revision: 06

root@es-group-22xc:/home/gogua# hdparm -I /dev/sda

/dev/sda:
SG_IO: bad/missing sense data, sb[]:  70 00 05 00 00 00 00 0a 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

ATA device, with non-removable media
Standards:
	Likely used: 1
Configuration:
	Logical		max	current
	cylinders	0	0
	heads		0	0
	sectors/track	0	0
	--
	Logical/Physical Sector size:           512 bytes
	device size with M = 1024*1024:           0 MBytes
	device size with M = 1000*1000:           0 MBytes 
	cache/buffer size  = unknown
Capabilities:
	IORDY not likely
	Cannot perform double-word IO
	R/W multiple sector transfer: not supported
	DMA: not supported
	PIO: pio0 

root@es-group-22xc:/home/gogua# lshw -class disk -class storage
  *-scsi                    
       physical id: 1
       logical name: scsi0
     *-disk
          description: SCSI Disk
          product: PersistentDisk
          vendor: Google
          physical id: 0.1.0
          bus info: scsi@0:0.1.0
          logical name: /dev/sda
          version: 1
          size: 100GiB (107GB)
          capabilities: gpt-1.00 partitioned partitioned:gpt
          configuration: ansiversion=6 guid=76f45aa5-496c-4018-bb70-e5450640be1f logicalsectorsize=512 sectorsize=4096

root@es-group-22xc:/home/gogua# time for i in `seq 1 1000`; do     dd bs=4k if=/dev/sda count=1 skip=$(( $RANDOM * 128 )) >/dev/null 2>&1; done

real	0m10.720s
user	0m1.218s
sys	0m0.720s
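Those timings work out to spinning-disk latency; a quick back-of-the-envelope check:

elapsed, reads = 10.72, 1000
print(elapsed / reads * 1e3)  # ~10.7 ms average latency per random 4 KiB read
print(reads / elapsed)        # ~93 IOPS, a typical figure for an HDD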

Upgrade to SSD or scale out to more spinning disks and/or nodes?

Thank you very much for your help

Hello Christian,

I upgraded the HDD to an SSD, but I have the same problem: every new request has to wait for the previous one to finish, and multithreading doesn't help.

Can you take a look at the iostat results?

What can I do to solve this problem?

I am not sure I understand what you mean by this. Can you explain further? What does CPU usage look like now that you have upgraded the storage?
