We are currently running experiments to evaluate Elasticsearch for our use case. While performance-testing a single-node instance, we came across periodic search-latency spikes.
We are running an OR query that combines a static term with a random string to prevent caching. We ramp up the number of concurrent users from 1 to 50. We tried both a single-shard index and a 5-shard index. The query contains a lot of aggregations, content-specific weighting based on a function_score query, multiple filters, and field collapsing.
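To make the shape of the request concrete, here is a minimal sketch of such a query body. It is an illustration only: the field names (`title`, `status`, `publish_date`, `content_type`, `group_id`), the weight, and the helper `build_query` are hypothetical placeholders, not our real mapping or query.

```python
import json
import random
import string


def build_query(static_term, rng):
    """Build a search body of the kind described above (hypothetical fields)."""
    # A fresh random token per request so identical requests are not cached.
    random_term = "".join(rng.choices(string.ascii_lowercase, k=8))
    return {
        "query": {
            "function_score": {
                "query": {
                    "bool": {
                        # OR semantics: either the static or the random term may match.
                        "must": [
                            {
                                "match": {
                                    "title": {
                                        "query": f"{static_term} {random_term}",
                                        "operator": "or",
                                    }
                                }
                            }
                        ],
                        # Filters do not contribute to scoring.
                        "filter": [
                            {"term": {"status": "published"}},
                            {"range": {"publish_date": {"gte": "now-1y/d"}}},
                        ],
                    }
                },
                # Content-specific weighting: boost one content type.
                "functions": [
                    {"filter": {"term": {"content_type": "article"}}, "weight": 2.0}
                ],
                "score_mode": "sum",
                "boost_mode": "multiply",
            }
        },
        # Field collapsing: keep only the top hit per group.
        "collapse": {"field": "group_id"},
        "aggs": {"by_type": {"terms": {"field": "content_type"}}},
        "size": 10,
    }


if __name__ == "__main__":
    print(json.dumps(build_query("elasticsearch", random.Random()), indent=2))
```

Each simulated user would send a body like this with a new random token per request.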
Single-shard performance seems to be worse with few concurrent users but better with 20-50 concurrent users. Both the single-shard and the 5-shard index show latency spikes every 2-3 minutes (see attached JMeter graphs). Could this be related to garbage collection? Has anyone come across similar behavior before and know of a mitigation strategy?
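One way we could check the garbage-collection hypothesis is to take two snapshots of the node stats JVM section (`GET _nodes/stats/jvm`) around a spike and compare the collector counters. The sketch below assumes the snapshots have already been fetched and parsed into dicts; the helper name `gc_deltas` is hypothetical, and the sample numbers in the usage are made up for illustration, not measurements from our cluster.

```python
def gc_deltas(before, after):
    """Return per-node, per-collector GC count and time deltas between two snapshots.

    Both arguments are parsed responses of the node stats API restricted to
    the JVM section (nodes -> <id> -> jvm -> gc -> collectors).
    """
    deltas = {}
    for node_id, node in after["nodes"].items():
        prev = before["nodes"].get(node_id)
        if prev is None:
            # Node was not present in the first snapshot; nothing to compare.
            continue
        cur_collectors = node["jvm"]["gc"]["collectors"]
        prev_collectors = prev["jvm"]["gc"]["collectors"]
        deltas[node_id] = {
            name: {
                "collections": stats["collection_count"]
                - prev_collectors[name]["collection_count"],
                "millis": stats["collection_time_in_millis"]
                - prev_collectors[name]["collection_time_in_millis"],
            }
            for name, stats in cur_collectors.items()
            if name in prev_collectors
        }
    return deltas


if __name__ == "__main__":
    # Made-up sample snapshots, only to show the expected input shape.
    before = {"nodes": {"n1": {"jvm": {"gc": {"collectors": {
        "young": {"collection_count": 100, "collection_time_in_millis": 2000},
        "old": {"collection_count": 3, "collection_time_in_millis": 900},
    }}}}}}
    after = {"nodes": {"n1": {"jvm": {"gc": {"collectors": {
        "young": {"collection_count": 112, "collection_time_in_millis": 2300},
        "old": {"collection_count": 4, "collection_time_in_millis": 1700},
    }}}}}}
    print(gc_deltas(before, after))
```

If the old-generation time delta jumps in the same intervals as the latency spikes, that would point toward GC pauses; if it stays flat, other periodic activity would be a more likely suspect.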