Single Node Performance and periodic latency spikes

Hi Everyone,

We are currently running experiments to evaluate Elasticsearch for our use case. While doing performance tests on a single-node instance, we came across periodic search-latency spikes.

We are running an OR query with a static string and a random string to prevent caching. We ramp up the concurrent users from 1 to 50. We tried both a single-shard and a 5-shard index. The query contains a lot of aggregations, content-specific weighting based on a function_score query, multiple filters, and field collapsing.
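For readers who want to reproduce something similar, here is a minimal sketch of a request body with that general shape: an OR match on a static plus a random term, wrapped in function_score, with a filter, field collapsing and one aggregation. The field names (`title`, `status`, `content_type`, `group_id`), the weight, and the helper name `build_query` are placeholders made up for illustration, not the actual query from the test.

```python
import json
import random
import string


def build_query(static_term, field="title", collapse_field="group_id", seed=None):
    """Build a search body with a random term so repeated requests differ.

    All field names and values here are hypothetical placeholders.
    """
    rng = random.Random(seed)
    # Random 8-letter token appended to the static term to defeat caching.
    random_term = "".join(rng.choices(string.ascii_lowercase, k=8))
    return {
        "query": {
            "function_score": {
                "query": {
                    "bool": {
                        "must": [
                            {
                                "match": {
                                    field: {
                                        "query": f"{static_term} {random_term}",
                                        "operator": "or",
                                    }
                                }
                            }
                        ],
                        # Filters do not contribute to the score.
                        "filter": [{"term": {"status": "published"}}],
                    }
                },
                # Content-specific weighting: boost one content type.
                "functions": [
                    {"filter": {"term": {"content_type": "article"}}, "weight": 2.0}
                ],
                "score_mode": "sum",
                "boost_mode": "multiply",
            }
        },
        # Field collapsing: keep only the top hit per group.
        "collapse": {"field": collapse_field},
        "aggs": {"by_type": {"terms": {"field": "content_type"}}},
    }


if __name__ == "__main__":
    print(json.dumps(build_query("elasticsearch"), indent=2))
```

Each call without a seed produces a different random term, so every request in the load test sends a different query string.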

Single-shard performance seems to be worse with few concurrent users but better with 20-50 concurrent users. Both the single-shard and the 5-shard index show latency spikes every 2-3 minutes (see attached JMeter graphs). Could this be related to garbage collection? Has anyone come across similar behavior before and knows of a mitigation strategy?

Best

Samy

Have you correlated the spikes against GC?

Working on it. We are using the hosted AWS Elasticsearch service, and I'm figuring out a way to get exact GC data.
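One possible approach without host access, sketched below, is to poll the node stats API (`GET _nodes/stats/jvm`) at a fixed interval and diff the cumulative GC counters, then line the timestamps up with the JMeter spikes. This assumes the endpoint is reachable on the managed domain; the helper names `fetch_jvm_stats` and `gc_deltas` are made up for illustration, and the plain HTTP call would need signing or other authentication if the domain requires it.

```python
import json
import urllib.request


def fetch_jvm_stats(base_url):
    """Fetch JVM node stats. Assumes an unauthenticated endpoint (hypothetical setup)."""
    with urllib.request.urlopen(f"{base_url}/_nodes/stats/jvm") as resp:
        return json.load(resp)


def gc_deltas(before, after):
    """Return per-node, per-collector GC count and time deltas between two snapshots."""
    deltas = {}
    for node_id, node in after.get("nodes", {}).items():
        prev = before.get("nodes", {}).get(node_id)
        if prev is None:
            continue  # node was not present in the earlier snapshot
        cur_c = node["jvm"]["gc"]["collectors"]
        prev_c = prev["jvm"]["gc"]["collectors"]
        deltas[node_id] = {
            name: {
                "collections": c["collection_count"]
                - prev_c[name]["collection_count"],
                "millis": c["collection_time_in_millis"]
                - prev_c[name]["collection_time_in_millis"],
            }
            for name, c in cur_c.items()
            if name in prev_c
        }
    return deltas
```

Calling `fetch_jvm_stats` every few seconds during the load test and passing consecutive snapshots to `gc_deltas` gives the GC time spent in each interval. A large `millis` value for the old-generation collector in the same interval as a latency spike would point toward GC.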

What kind of instance are you using?

c5.12xlarge with around 250,000 documents