Change thread pool search queue_size? Yes or no?

mitabrev · August 16, 2017, 2:56pm

Got one question. Working on my elastic stack. Basically it's "developing production".

Got one server with 16GB of RAM, 4 CPUs, ELK 5.4.0. No cluster or extra nodes.

Got no problems with adding data and searching; the problems start with dashboards that have multiple visualizations.

I figure, the problem is probably CPU?

As development went on, the number of documents also increased, since I try to automate as much as possible. Indices are created daily with around 5-6 million documents per index, so with a few months of data I'm already at several hundred million documents.

So when I open a dashboard with multiple visualizations I receive a timeout error in Kibana, and the Elasticsearch log says:

Caused by: org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution of org.elasticsearch.transport.TransportService$7@59b0d1f1 on EsThreadPoolExecutor[search, queue capacity = 1000, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@395e3b7a[Running, pool size = 7, active threads = 7, queued tasks = 1000, completed tasks = 3105230]]

So I'm thinking of maybe increasing the search queue_size. Would that even make sense? What would be the optimal size in my situation? The best option would probably be to add two more nodes and create a cluster.

Thanks!
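
For readers following along, the numbers in the rejection message above can be pulled apart with a short script. This is only a sketch: the function names `parse_rejection` and `default_search_pool_size` are made up for illustration, and the pool-size formula is what I understand the Elasticsearch 5.x thread pool documentation to state for the search pool (`int((processors * 3) / 2) + 1`), so treat it as an assumption and check it against your version.

```python
import re

# Rejection message copied from the Elasticsearch log quoted above.
LOG_LINE = (
    "Caused by: org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: "
    "rejected execution of org.elasticsearch.transport.TransportService$7@59b0d1f1 on "
    "EsThreadPoolExecutor[search, queue capacity = 1000, "
    "org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@395e3b7a"
    "[Running, pool size = 7, active threads = 7, queued tasks = 1000, "
    "completed tasks = 3105230]]"
)

PATTERN = re.compile(
    r"EsThreadPoolExecutor\[(?P<pool>\w+), queue capacity = (?P<capacity>\d+),"
    r".*?pool size = (?P<size>\d+), active threads = (?P<active>\d+),"
    r" queued tasks = (?P<queued>\d+), completed tasks = (?P<completed>\d+)"
)


def parse_rejection(line):
    """Return the thread pool counters from a rejection message, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    stats = {
        key: (value if key == "pool" else int(value))
        for key, value in match.groupdict().items()
    }
    # Every worker busy and the queue full: requests arrive faster than they finish.
    stats["saturated"] = (
        stats["active"] == stats["size"] and stats["queued"] >= stats["capacity"]
    )
    return stats


def default_search_pool_size(processors):
    """Assumed 5.x default for the search pool: int((processors * 3) / 2) + 1."""
    return (processors * 3) // 2 + 1


if __name__ == "__main__":
    print(parse_rejection(LOG_LINE))
    print(default_search_pool_size(4))
```

With 4 CPUs the assumed formula gives 7, which matches the `pool size = 7` in the log. All 7 threads are active and the queue is at its capacity of 1000, so a larger queue_size would most likely only delay the rejections rather than make the searches finish any faster.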

Christian_Dahlqvist · August 16, 2017, 3:06pm

How many indices and shards are you creating per day? What is your average shard size?

mitabrev · August 16, 2017, 3:11pm

1 index and 1 shard per day. Size of data in index is around 2 GB per index.

Christian_Dahlqvist · August 16, 2017, 3:18pm

That sounds quite reasonable. How many visualisations do you have on the affected dashboards? Do you have X-Pack Monitoring installed? What do CPU usage and disk I/O look like when you experience the timeout?

mitabrev · August 16, 2017, 6:24pm

5 visualizations. Yes, X-Pack is installed. The only thing I really see out of the ordinary is the system load, which rose above 8.00.

Then Elasticsearch just quit / was killed for a few minutes.

So my assumption is a lack of CPU?

Christian_Dahlqvist · August 16, 2017, 6:25pm

What is your heap size? What does the GC graph look like in monitoring?

mitabrev · August 17, 2017, 7:01am

-Xms8g
-Xmx8g

GC count young jumped from 8 to 23, duration from 230 ms to 781 ms, Cgroup CPU utilization from 26% to 66% and Cgroup usage from 35b ns to 53.7b ns.

Other graphs had no significant changes.

mitabrev · September 5, 2017, 2:17pm

Do you have any tips or suggestions?

Thanks

Christian_Dahlqvist · September 6, 2017, 6:44am

How large are your indices/shards? How many are you querying across when you encounter problems?

mitabrev · September 6, 2017, 7:07am

Indices are on average around 2.5 GB with 6 million documents, created daily. The default search/overview is the last 24 hours.
I increased the CPU cores, but still encounter timeouts, even with only 2-3 visualizations on a dashboard.

Christian_Dahlqvist · September 6, 2017, 7:12am

What type of storage do you have? What do disk I/O and iowait look like? How many concurrent queries?

mitabrev · September 6, 2017, 8:05am

Looks like a disk read issue...
As soon as I open the dashboard the values increase...

From:
Device:  rrqm/s  wrqm/s      r/s    w/s       rkB/s    wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sda        0.00    0.00     0.00   0.80        0.00     8.00     20.00      0.00   0.00     0.00     0.00   0.00   0.00
sdb        0.00    0.00     0.00   2.20        0.00    16.20     14.73      0.00   0.55     0.00     0.55   0.36   0.08

To:
Device:  rrqm/s  wrqm/s      r/s    w/s       rkB/s    wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sda        0.00  659.20    36.80  25.40      946.40  2744.00    118.66      0.29   4.73     4.66     4.83   1.54   9.60
sdb        0.00    0.40  3645.00  11.00  1582319.20    46.70    865.63    130.93  35.73    35.77    20.13   0.27  99.98
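
The two iostat samples above can also be compared programmatically. The snippet below is a rough sketch, not a monitoring tool: `parse_iostat` and `saturated_devices` are hypothetical helper names, and the 90% `%util` threshold is an arbitrary cut-off chosen for illustration.

```python
# Header and rows copied from the iostat output quoted above.
HEADER = (
    "Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s "
    "avgrq-sz avgqu-sz await r_await w_await svctm %util"
)

BEFORE = """
sda 0.00 0.00 0.00 0.80 0.00 8.00 20.00 0.00 0.00 0.00 0.00 0.00 0.00
sdb 0.00 0.00 0.00 2.20 0.00 16.20 14.73 0.00 0.55 0.00 0.55 0.36 0.08
"""

AFTER = """
sda 0.00 659.20 36.80 25.40 946.40 2744.00 118.66 0.29 4.73 4.66 4.83 1.54 9.60
sdb 0.00 0.40 3645.00 11.00 1582319.20 46.70 865.63 130.93 35.73 35.77 20.13 0.27 99.98
"""


def parse_iostat(header, rows):
    """Map each device name to a dict of metric name -> float value."""
    columns = header.split()[1:]  # drop the leading "Device:" label
    sample = {}
    for row in rows.strip().splitlines():
        fields = row.split()
        sample[fields[0]] = dict(zip(columns, map(float, fields[1:])))
    return sample


def saturated_devices(sample, util_threshold=90.0):
    """Return devices whose %util is at or above the (assumed) threshold."""
    return [
        device
        for device, metrics in sample.items()
        if metrics["%util"] >= util_threshold
    ]


if __name__ == "__main__":
    before = parse_iostat(HEADER, BEFORE)
    after = parse_iostat(HEADER, AFTER)
    print("saturated before:", saturated_devices(before))
    print("saturated after:", saturated_devices(after))
    print("sdb r/s:", before["sdb"]["r/s"], "->", after["sdb"]["r/s"])
```

On these samples only sdb crosses the threshold once the dashboard is opened: reads go from 0 to 3645 per second, read latency (r_await) rises to about 36 ms, and the device sits at 99.98% utilization.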

Christian_Dahlqvist · September 6, 2017, 8:20am

Yes, it looks like your storage is indeed the bottleneck.

system · October 4, 2017, 8:21am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
