Can I control max concurrent shard requests for the APM dashboard?


(Jung Juhong) #1

I was able to speed up Discover and Visualize with the max concurrent shard requests option.
But it looks like there is no max concurrent shard requests parameter for APM.
Are there any other options?
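
For reference, the option mentioned above corresponds to the `max_concurrent_shard_requests` query parameter of the Elasticsearch search API. A minimal sketch of how a search URL carrying it might be built; the host, the index pattern `apm-*`, and the value 10 are assumptions for illustration, not settings from my cluster.

```python
from urllib.parse import urlencode


def build_search_url(host, index_pattern, max_concurrent_shard_requests):
    """Return a _search URL that caps concurrent shard requests for one search."""
    query = urlencode(
        {"max_concurrent_shard_requests": max_concurrent_shard_requests}
    )
    return f"{host}/{index_pattern}/_search?{query}"


# Hypothetical host and index pattern, used only to show the URL shape.
url = build_search_url("http://localhost:9200", "apm-*", 10)
print(url)
```

This only covers searches I send myself; it does not change how the APM app issues its own queries.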


(Christian Dahlqvist) #2

How many indices and shards are you querying?


(Jung Juhong) #3

My APM transaction and span indices are created daily, and each index consists of 20 shards.
There are only 4 days of indices so far, but there will be more.


(Christian Dahlqvist) #4

Why do you have 20 shards per index? How many daily indices do you create? How much data do you ingest per day?

Having lots of small indices and shards can be very inefficient, so try to follow the guidelines outlined in this blog post about shards and sharding practices.


(Jung Juhong) #5

Transaction indices are almost 1TB per day. I thought 5 shards was too few for indices that large.
Would hourly indices be better in this case?


(Christian Dahlqvist) #6

What is the average shard size? What is your retention period for this data?


(Jung Juhong) #7

Sorry, the 1TB figure was wrong.
The primary index size is about 350GB to 400GB, and I estimate the average shard size is about 20GB.
There isn't enough data yet, but my target retention period is 7 to 14 days.
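
The arithmetic behind those numbers can be sketched as follows. This is a rough estimate that assumes data is spread evenly across shards and counts only one daily index with 20 primary shards; the function names are just for illustration.

```python
def average_shard_size_gb(primary_index_gb, shards_per_index):
    """Average primary shard size, assuming data is spread evenly."""
    return primary_index_gb / shards_per_index


def total_primary_shards(shards_per_index, retention_days, indices_per_day=1):
    """Primary shards held at once for a given retention period."""
    return shards_per_index * retention_days * indices_per_day


# 350GB to 400GB of primary data over 20 shards.
print(average_shard_size_gb(350, 20))  # 17.5
print(average_shard_size_gb(400, 20))  # 20.0

# One daily index with 20 shards, kept for 7 or 14 days.
print(total_primary_shards(20, 7))   # 140
print(total_primary_shards(20, 14))  # 280
```

Passing `indices_per_day=2` would account for a second daily index (for example spans) with the same shard count.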


(Christian Dahlqvist) #8

How many nodes do you have in your cluster? What is the specification of these nodes? What type of storage are you using?


(Jung Juhong) #9

There are 8 i3.2xlarge (NVMe SSD) EC2 instances. Maybe I will need to add more nodes.


(Christian Dahlqvist) #10

If you have only one daily index and your average shard size is indeed 20GB, it sounds quite reasonable. If you are experiencing performance problems I would look at the cluster to see what is likely to limit performance. As your nodes have fast SSDs I would look at CPU utilisation as querying and indexing can be CPU intensive. Also look out for any issues related to GC in the Elasticsearch logs.
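
One way to spot CPU-bound nodes is to look at the `cpu` column of the cat nodes API. A minimal sketch, assuming the input is the JSON returned by `GET _cat/nodes?format=json&h=name,cpu`; the sample data and the 80% threshold below are made up for illustration.

```python
import json


def busy_nodes(cat_nodes_json, cpu_threshold=80):
    """Return names of nodes whose reported CPU percent meets the threshold."""
    rows = json.loads(cat_nodes_json)
    # Values may arrive as strings in the JSON output, so convert before comparing.
    return [row["name"] for row in rows if int(row["cpu"]) >= cpu_threshold]


# Made-up sample response, not real cluster output.
sample = '[{"name": "node-1", "cpu": "95"}, {"name": "node-2", "cpu": "12"}]'
print(busy_nodes(sample))
```

Running this while a slow dashboard query is in flight should show whether the spike hits all nodes or only a few.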


(Jung Juhong) #11

Thanks for your advice.
I checked each node's CPU utilization and it was very high while I was querying.
But is there any solution other than scaling out?
Here are my monitoring metrics.
The CPU spikes occurred when I ran queries.


(Christian Dahlqvist) #12

It looks like you are limited by CPU at times, so scaling out is one way to address it. Otherwise you may need to try to reduce CPU usage, but I am not sure how best to go about that.