Elasticsearch search queue choking

I am running a 2-node cluster with 2-core CPUs. I have 2 indices with about 1 million docs each. They are flat documents, and I need to enable search on the title field, which is stored both as text and keyword. The search text is a minimum of 3 characters long.
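For reference, a minimal sketch of that part of the mapping (cluster address, index name and the 6.x `_doc` mapping type are placeholders, not our exact config):

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster address; "titles" is a placeholder index name

mapping = {
    "mappings": {
        "_doc": {  # mapping type, still required on 6.x
            "properties": {
                "title": {
                    "type": "text",                     # analysed, for full-text search
                    "fields": {
                        "keyword": {"type": "keyword"}  # exact-value sub-field
                    }
                }
            }
        }
    }
}

print(requests.put(f"{ES}/titles", json=mapping).json())
```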

Initial search queries:
Earlier I was using wildcard queries for the search, and from time to time, when the number of requests spiked to even 1k, I got a search phase exception with the search queue crossing its 1k limit.
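Roughly the shape of those wildcard queries (index and field names are simplified; whether the text or keyword field was targeted is an assumption):

```python
import requests

ES = "http://localhost:9200"  # assumed; "titles" is a placeholder index name

query = {
    "query": {
        "wildcard": {
            # a leading * forces Elasticsearch to walk a large part of the term dictionary
            "title.keyword": {"value": "*search text*"}
        }
    }
}

resp = requests.post(f"{ES}/titles/_search", json=query)
print(resp.json()["hits"]["total"])
```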

We have now moved to an n-gram search on the title. Looking at searches individually, it clearly shows improvements, but load testing with even 1k requests ramped up over 10 seconds is choking the search queue.
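This is roughly the kind of setup we have moved to; the exact analyzer below (standard tokenizer plus an ngram filter, with ngrams applied at index time only) is an illustration rather than our production config:

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster; index and analyzer names are placeholders

settings = {
    "settings": {
        "index.max_ngram_diff": 7,  # needed once max_gram - min_gram exceeds 1
        "analysis": {
            "filter": {
                "title_ngrams": {"type": "ngram", "min_gram": 3, "max_gram": 10}
            },
            "analyzer": {
                "title_ngram_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "title_ngrams"]
                }
            }
        }
    },
    "mappings": {
        "_doc": {
            "properties": {
                "title": {
                    "type": "text",
                    "analyzer": "title_ngram_analyzer",  # ngrams at index time only
                    "search_analyzer": "standard",       # keep the query side cheap
                    "fields": {"keyword": {"type": "keyword"}}
                }
            }
        }
    }
}

requests.put(f"{ES}/titles_ngram", json=settings)

# A partial match then becomes an ordinary match query on the ngram-analysed field.
resp = requests.post(f"{ES}/titles_ngram/_search",
                     json={"query": {"match": {"title": "sear"}}})
print(resp.json()["hits"]["total"])
```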

I want to understand what may be causing the search queue to fill up even with a load as low as 1k requests in 10 seconds.

Note:
Upon load testing, I've also noticed that if I run 3 tests in a row with the same load (1k requests ramped up over 10 seconds), the first test fills up the queue to about 500, half its limit. When I ran the second test right after the first one finished, it queued up to about 800, and the third one crossed the 1k limit almost as soon as it started.
I'm exploring caching/memory cleanups while writing this post to understand what may be at play here.
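For anyone following along, this is the kind of check I'm running between tests to watch the queue and to rule caching in or out (cluster address and index name are placeholders):

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster address

# Watch the search thread pool between runs: the queue and rejected columns show
# how close each node is to the 1k queue limit.
print(requests.get(
    f"{ES}/_cat/thread_pool/search",
    params={"v": "true", "h": "node_name,active,queue,rejected,completed"}
).text)

# To rule caching in or out between runs, the caches can be cleared explicitly
# ("titles" is a placeholder index name).
requests.post(f"{ES}/titles/_cache/clear",
              params={"request": "true", "query": "true", "fielddata": "true"})
```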

Ideally, I'm load testing my system to serve up to 50k requests a minute. I want to understand whether I'm hitting the cluster limit or whether there is still scope for improvement in my mappings/queries.

My use case is search heavy.

You have left out some important information:

  • Which version of Elasticsearch are you using?
  • How much RAM does each node have? How large is the heap?
  • What is the size of the indices and shards you are querying? How many primary shards do the indices have? Do you have a replica enabled?
  • Are you indexing new data or performing updates? If so, do you do so evenly over time or at specific times?

Right, here you go.

Elastic version: 6.8.0
Lucene version: 7.7.0
RAM/node: 8 GB
Heap/instance: 4 GB
Index size (avg): 600 MB
Shard size: 95 MB (replica enabled)

At the moment, we have 3 shards per index with replicas enabled, although we've reproduced the same results with 1 shard and 5 shards per index as well.
New data is added at a low frequency: about 1k-1.5k documents per day, divided evenly across hourly cron jobs. There are hardly any updates to the existing data.
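The sizes above were read off the usual _cat endpoints, roughly like this (cluster address assumed local):

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster address

# Index-level sizes, doc counts and primary/replica counts
print(requests.get(f"{ES}/_cat/indices", params={"v": "true"}).text)

# Per-shard sizes and how they are spread over the two nodes
print(requests.get(f"{ES}/_cat/shards", params={"v": "true"}).text)
```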

I would recommend the following:

  • Change the indices, which are very small, to only have 1 primary shard. Anything more is a waste of resources and will reduce performance (see the sketch after this list).
  • Upgrade to the latest version of Elasticsearch. Version 6.8 is very old and has been EOL for a long time.
  • Your data size is small enough to comfortably fit in the operating system page cache, so that is all good. If you add more data, ensure this remains the case, as it means you will not be limited by disk I/O. Ensure that all RAM on the nodes is exclusively available to Elasticsearch.
  • Before benchmarking, make sure you force merge each index down to a single segment (also shown in the sketch after this list).
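As a rough sketch of the first and last points (index names are placeholders, and the real mappings/analysis settings would need to be carried over when recreating the index):

```python
import requests

ES = "http://localhost:9200"  # assumed; "titles" / "titles_v2" are placeholder index names

# Recreate the index with a single primary shard (plus one replica) and copy the data over.
requests.put(f"{ES}/titles_v2", json={
    "settings": {"number_of_shards": 1, "number_of_replicas": 1}
    # mappings and analysis settings from the original index would go here as well
})
requests.post(f"{ES}/_reindex", json={
    "source": {"index": "titles"},
    "dest": {"index": "titles_v2"}
})

# Before benchmarking, force merge the (now read-mostly) index down to one segment.
requests.post(f"{ES}/titles_v2/_forcemerge", params={"max_num_segments": 1})
```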

When you benchmark, start at a lower concurrency level, e.g. 100 queries/second, without any indexing taking place. Do not immediately jump to what you want to achieve. Keep track of the latencies you experience. If these are acceptable, run the test multiple times and gradually increase the concurrency, e.g. by 100 at a time. This will show you clearly at which point your latencies get too long and you have exceeded what your small cluster can handle. At that point, try increasing the CPU allocation and see what difference that makes. Make sure queries are evenly distributed across the nodes in the cluster.
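A very rough sketch of such a stepped ramp, assuming a local cluster, a placeholder index and a representative query (a proper load-testing tool such as Rally or JMeter is a better fit for real numbers; this only illustrates the approach):

```python
import time
import statistics
import requests
from concurrent.futures import ThreadPoolExecutor

ES = "http://localhost:9200"                       # assumed; spread this over both nodes in practice
INDEX = "titles"                                   # placeholder index name
QUERY = {"query": {"match": {"title": "sear"}}}    # stand-in for a representative partial-match query

def one_search():
    start = time.perf_counter()
    requests.post(f"{ES}/{INDEX}/_search", json=QUERY)
    return time.perf_counter() - start

# Step the offered load up gradually and note where latency starts to degrade.
for qps in range(100, 1001, 100):
    futures = []
    with ThreadPoolExecutor(max_workers=50) as pool:
        deadline = time.time() + 10                # run each step for ~10 seconds
        while time.time() < deadline:
            futures.append(pool.submit(one_search))
            time.sleep(1.0 / qps)                  # crude open-loop pacing
    latencies = sorted(f.result() for f in futures)
    p95 = latencies[int(len(latencies) * 0.95)]
    print(f"{qps:4d} q/s  median={statistics.median(latencies) * 1000:.1f} ms  "
          f"p95={p95 * 1000:.1f} ms")
```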

Once you have optimised query performance, you can start adding updates and indexing and see what impact this has. As force merging down to a single segment can help performance, it may be beneficial to apply changes less frequently and force merge after every batch of updates.

Note that 2 CPU cores per data node is very little and may not be sufficient.
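To put a rough number on that, a back-of-envelope check of the 50k/minute target, assuming the 6.x defaults (4 search threads and a queue of 1000 per 2-core node):

```python
# Back-of-envelope only; thread pool size assumes the 6.x default of
# int(cores * 3 / 2) + 1 = 4 search threads per 2-core node.
nodes, threads_per_node = 2, 4
target_qps = 50_000 / 60                         # ≈ 833 searches/second

concurrent_searches = nodes * threads_per_node   # 8 searches executing at any instant
budget_ms = concurrent_searches / target_qps * 1000
print(f"each search must complete in ~{budget_ms:.1f} ms to keep the queue from growing")
# ~9.6 ms, and with 3 primaries plus replicas each request fans out to several
# shard-level searches, so the per-shard budget is smaller still.
```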


Thank you for such prompt responses, Christian. Let me try the suggested approach.

Also, my use case of partial search on a key will not be changing anytime soon, and new data will keep being indexed every day going forward. I'm using n-gram with (min: 3, max: 10). I'd like to know if there is a better way to do partial searches altogether, because as the throughput increases, the queue starts filling up and the response time increases marginally. I'd like to make sure the queries and mappings are not playing any further part in this bottleneck before moving ahead with architecture-level upgrades.

In recent versions of Elasticsearch there is a new wildcard field type that uses ngrams behind the scenes and makes wildcard queries faster. I would recommend you upgrade so you can use it, as it seems to fit your use case. Partial searches through ngrams or wildcard queries will always be slower than straight term lookups, as more computational work is required, so while you can optimise them, they will still have an impact.
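A minimal sketch of what that could look like once you are on 7.9 or later (index name is a placeholder):

```python
import requests

ES = "http://localhost:9200"  # assumed; requires Elasticsearch 7.9+ for the wildcard field type

# The wildcard field type indexes ngrams internally, so patterns with a leading *
# no longer have to scan the whole term dictionary.
requests.put(f"{ES}/titles_v3", json={
    "mappings": {
        "properties": {
            "title": {"type": "wildcard"}
        }
    }
})

resp = requests.post(f"{ES}/titles_v3/_search", json={
    "query": {"wildcard": {"title": {"value": "*search text*"}}}
})
print(resp.json()["hits"]["total"])
```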

