Setup to minimize search latency

Dustin_Boswell · May 21, 2014, 7:33pm

Hi everyone,
I have a 10 million document index, and multiple high-memory (e.g. 250GB
ram, 32 cores) machines available. I'd like to do everything possible to
keep search latency as low as possible (< 50ms ideally), especially under
high throughput. I know it depends a lot on the query, but
to start with I'm asking about general index/cluster settings.

Here's a list of things I'm doing so far:

ES_HEAP_SIZE=100g
machine has no swap
20 shards

Are there any other settings or search parameters I should be aware of?

Also, I'm wondering how many shards is recommended in my case. Having more
shards helps reduce latency by parallelizing the work, but at some point
the overhead of fanning out the requests and collecting the partial results
will take over and latency would get worse. Is there a rule of thumb for a
sweet spot that others have found?

The volume of updates to the index is relatively small (500K/day), but
bursty. From initial testing, it seems like updates being issued can
increase the search latency happening on the same machine. Is there a good
way to "isolate" search and updates, either by some setting, or splitting
up the cluster somehow to have dedicated update nodes and dedicated search
nodes? (Not sure how you'd deploy a setup like this, or control where the
search/update calls went.)

The query I'm optimizing for will have a text search component and a
geo-restrict component, maybe something like this:
{
  "query": {
    // query may get more complex in the future
    "match": { "_all": "my search terms" }
  },
  "filter": {
    "geo_distance": {
      "distance": "100km",
      "location": {
        "lat": 34.04,
        "lon": -118.49
      }
    }
  }
}

For the geo filter, I've tried the optimize_bbox option, and the default of
"memory" seemed to work the best, surprisingly. I haven't tried using
geohash yet, and I can't tell from the docs how one might use it, but maybe
that is inherently faster since it uses indexes?

Unfortunately, there are a lot of unique locations in my query stream, so I
don't know if caching this filter will work. (Each filter cache consumes
about 1 bit in memory per document, is that right? So about 1.25MB in my
case. Storing the most frequent 10,000 of these would take up about 12.5GB
of ram. So maybe that's doable...)
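
The arithmetic above can be checked with a quick sketch. It assumes each cached filter is stored as a plain, uncompressed bitset of one bit per document, with no per-segment or bookkeeping overhead; the function name is made up for illustration.

```python
def filter_cache_bytes(num_docs: int, num_filters: int = 1) -> int:
    """Bytes needed to cache `num_filters` filters as uncompressed bitsets.

    Assumption: exactly one bit per document per filter, rounded up to
    whole bytes. Real cache entries may carry extra overhead.
    """
    bytes_per_filter = (num_docs + 7) // 8  # round up to whole bytes
    return bytes_per_filter * num_filters


if __name__ == "__main__":
    one = filter_cache_bytes(10_000_000)
    many = filter_cache_bytes(10_000_000, 10_000)
    print(f"one cached filter:     {one / 1e6:.2f} MB")
    print(f"10,000 cached filters: {many / 1e9:.1f} GB")
```

Under that assumption the numbers in the paragraph hold: 1.25 MB per filter and 12.5 GB for 10,000 of them.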

Sorry if that's a lot of questions, but I figured other people may benefit
from this thread too.
Thanks for any help.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/e8e878ac-76d6-4233-a8e5-21908bd33e84%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

jpountz · May 23, 2014, 12:27am

On Wed, May 21, 2014 at 9:33 PM, Dustin Boswell dboswell@gmail.com wrote:

> Hi everyone,
> I have a 10 million document index, and multiple high-memory (e.g. 250GB
> ram, 32 cores) machines available. I'd like to do everything possible to
> keep search latency as low as possible (< 50ms ideally), especially under
> high throughput. I know it depends a lot on the query, but
> to start with I'm asking about general index/cluster settings.
>
> Here's a list of things I'm doing so far:
>
> ES_HEAP_SIZE=100g
> machine has no swap
> 20 shards
>
> Are there any other settings or search parameters I should be aware of?

If you care about latency, it might make sense to configure a smaller heap
size. The issue with large heaps is that they take longer to collect,
especially for collections of the old generation (I'm talking about minutes
here). I would recommend setting the heap size to at most 30GB (which will
also allow you to benefit from compressed pointers). One way to do this
could be to start several nodes per physical machine.
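
As a rough illustration of the "several nodes per machine" idea, here is a small sketch that estimates how many 30GB-heap nodes fit on one box. The `file_cache_fraction` parameter, and the idea of reserving half the RAM for the operating system's file cache, are assumptions added for the example and are not from this thread; the function name is hypothetical.

```python
def nodes_per_machine(ram_gb: float,
                      heap_per_node_gb: float = 30,
                      file_cache_fraction: float = 0.5) -> int:
    """Estimate how many nodes with a fixed heap fit on one machine.

    Assumption (not from the thread): `file_cache_fraction` of the RAM is
    left to the OS file cache, and the rest is split into per-node heaps.
    """
    if not 0 <= file_cache_fraction < 1:
        raise ValueError("file_cache_fraction must be in [0, 1)")
    heap_budget_gb = ram_gb * (1 - file_cache_fraction)
    return max(1, int(heap_budget_gb // heap_per_node_gb))


if __name__ == "__main__":
    n = nodes_per_machine(250)  # the 250GB machines mentioned above
    print(f"{n} nodes per machine, each started with ES_HEAP_SIZE=30g")
```

With the 250GB machines from the question and these assumed defaults, this gives 4 nodes per machine; the right number still depends on measurement.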

> Also, I'm wondering how many shards is recommended in my case. Having more
> shards helps reduce latency by parallelizing the work, but at some point
> the overhead of fanning out the requests and collecting the partial results
> will take over and latency would get worse. Is there a rule of thumb for a
> sweet spot that others have found?

If you don't have a lot of traffic, you could think about configuring
num_shards = total_num_cpus / num_concurrent_queries. But as you said,
there is also some overhead to large numbers of shards so this deserves
testing.
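
The rule of thumb above can be written down as a tiny helper. This is only a starting point for testing, not a tuned recommendation; the function name and the example of 4 concurrent queries are made up for illustration.

```python
def suggested_num_shards(total_num_cpus: int, num_concurrent_queries: int) -> int:
    """num_shards = total_num_cpus / num_concurrent_queries, floored, at least 1."""
    if total_num_cpus < 1 or num_concurrent_queries < 1:
        raise ValueError("both arguments must be positive")
    return max(1, total_num_cpus // num_concurrent_queries)


if __name__ == "__main__":
    # One 32-core machine and a hypothetical 4 queries in flight at a time.
    print(suggested_num_shards(32, 4))
```

The idea is that each in-flight query can then keep roughly one core busy per shard without competing with other queries; with many concurrent queries the formula bottoms out at a single shard.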

> The volume of updates to the index is relatively small (500K/day), but
> bursty. From initial testing, it seems like updates being issued can
> increase the search latency happening on the same machine. Is there a good
> way to "isolate" search and updates, either by some setting, or splitting
> up the cluster somehow to have dedicated update nodes and dedicated search
> nodes? (Not sure how you'd deploy a setup like this, or control where the
> search/update calls went.)
>
> The query I'm optimizing for will have a text search component and a
> geo-restrict component, maybe something like this:
> {
>   "query": {
>     // query may get more complex in the future
>     "match": { "_all": "my search terms" }
>   },
>   "filter": {
>     "geo_distance": {
>       "distance": "100km",
>       "location": {
>         "lat": 34.04,
>         "lon": -118.49
>       }
>     }
>   }
> }
>
> For the geo filter, I've tried the optimize_bbox option, and the default
> of "memory" seemed to work the best, surprisingly. I haven't tried using
> geohash yet, and I can't tell from the docs how one might use it, but maybe
> that is inherently faster since it uses indexes?

The bbox optimization is useful if your geo query matches a small portion
of your index. Maybe the issue here is that with such a large radius, you
match most of your documents?

> Unfortunately, there are a lot of unique locations in my query stream, so
> I don't know if caching this filter will work. (Each filter cache consumes
> about 1 bit in memory per document, is that right? So about 1.25MB in my
> case. Storing the most frequent 10,000 of these would take up about 12.5GB
> of ram. So maybe that's doable...)

One thing to beware of is that if you cache a geo filter, Elasticsearch
will need to evaluate it against all documents in your index before
caching it. On the other hand, by default (if the filter is not cached), the
filter will only be evaluated on documents that match the query (but for
every query, not just the first one). So unless you have good reasons to think
that your geo filter will be reused, I would recommend against caching it.
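
To make the caching choice explicit in a request, here is a minimal sketch that builds the search body from the question and sets the per-filter `_cache` flag. It assumes the `_cache` option that filters accepted in Elasticsearch releases of this era (check the documentation for your version); the helper name and its parameters are made up for illustration, and the body mirrors the structure quoted above rather than any recommended form.

```python
import json


def build_search_body(terms: str, lat: float, lon: float,
                      distance: str = "100km",
                      cache_geo_filter: bool = False) -> dict:
    """Build the text + geo_distance search body used in this thread.

    Assumption: the geo_distance filter accepts a `_cache` flag, as filters
    did in Elasticsearch versions from around the time of this thread.
    """
    return {
        "query": {"match": {"_all": terms}},
        "filter": {
            "geo_distance": {
                "distance": distance,
                "location": {"lat": lat, "lon": lon},
                # Left uncached unless the same location is expected to recur.
                "_cache": cache_geo_filter,
            }
        },
    }


if __name__ == "__main__":
    body = build_search_body("my search terms", 34.04, -118.49)
    print(json.dumps(body, indent=2))
```

Passing `cache_geo_filter=True` only for locations known to recur often would be one way to limit caching to filters that are likely to be reused.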

--
Adrien Grand

