Random Access & Performance


(phobos182) #1

A generic question regarding ElasticSearch / Lucene. I have a lot of indexes, and they are very large: about 3 billion documents that need to be searched very quickly. Each document has a lot of fields (10-20) and can range from 1 KB to 64 KB in size. Not all fields are indexed, and a lot of them are stored; I believe 5-8 fields are indexed. We also have some fields which provide term vectors.

Keeping cached results is not very useful. Searches usually do not repeat on the cluster, so caching frequent searches does not help. These indexes together represent 2 years of data, so there is a lot of random access across the shards. When we were using Solr we had a lot of disk access because we could not warm caches, as we did not know what searches were going to be performed. These indexes are for research and very ad hoc.

My question is this: would solid-state disks provide the best result latency, since I'm going to have to fetch the documents from disk no matter what? I'm thinking of a 16-node ElasticSearch cluster with 48GB of RAM per node and 6 x 300GB SSDs behind a good RAID controller (LSI 9280, Areca 1880i, etc.). Put them in RAID-0 and use a good replication factor on the cluster. Would this be a good recommendation? I'm looking for ways to scale the performance of the cluster, and adding a ton of rotating hard disks does not seem a feasible way to get better query times.
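For a rough sanity check, the proposed hardware works out as follows. This is just arithmetic on the numbers above, assuming RAID-0 capacity is simply the sum of the disks and that a replica doubles the space an index needs:

```python
# Back-of-envelope capacity for the proposed cluster:
# 16 nodes, each with 6 x 300GB SSDs in RAID-0.
NODES = 16
DISKS_PER_NODE = 6
DISK_GB = 300

raw_per_node_gb = DISKS_PER_NODE * DISK_GB          # RAID-0 sums capacity
raw_cluster_tb = NODES * raw_per_node_gb / 1000.0

def usable_tb(replicas):
    """Index data the cluster can hold when every shard
    is kept in `replicas` extra copies."""
    return raw_cluster_tb / (1 + replicas)

print(raw_cluster_tb)   # 28.8 TB raw
print(usable_tb(1))     # 14.4 TB usable with one replica per shard
```

With one replica that leaves roughly 14 TB of usable space, which is tight against a 10-13TB data set once you allow for segment merges and growth.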

Thanks!


(Clinton Gormley) #2

Hi Phobos

Since our indexes are very large, keeping cached results is not very useful.
Searches usually do not repeat on the cluster, so caching frequent searches
do not help.

However, you probably filter your results (eg by category, tags, date
etc). It probably makes sense that these are cached. ES caches various
filters by default, which improves performance.

These indexes together represent 2 years of data, so there is a
lot of random access around the shards. When we were using Solr we had a lot
of disk access because we could not hold the indexes in memory, or OS disk
cache.

There is no doubt that having your data in memory makes things faster.
The beauty of ElasticSearch is the built in clustering, so that you're
not limited by the memory available on a single node. You can spread
out your docs over many nodes.

Also, it's not just ES that needs memory. For instance, ES doesn't
cache the documents themselves in memory, but relies on your kernel's
file cache, so you don't want to assign all your RAM to ES.

Perhaps 60-80% should be for ES - you'll have to watch how your kernel
file caches grow with your data to know what the best % is.
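As a sketch of that split (the 48GB figure is from the original post; the exact fraction is something you'd tune by observation, not a rule):

```python
def split_ram(total_gb, es_fraction=0.7):
    """Divide node RAM between the ES heap and the OS file cache.

    es_fraction follows the rough 60-80% guideline above; watch how
    the kernel file cache grows under real load and adjust.
    """
    es_gb = int(total_gb * es_fraction)
    return es_gb, total_gb - es_gb

print(split_ram(48))   # 33GB for ES, 15GB left for the kernel file cache
```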

My question is this. Would Solid State Disks provide the best result latency
since I'm going to have to fetch the documents from disk no matter what? I'm
thinking of a 16 node ElasticSearch cluster with 48GB of RAM per node, and
6x 300GB + SSD disks with a great RAID controller (LSI 9280, Areca 1880i,
etc..). Put them in RAID0 and have a good replication factor on the cluster.
Would this be a good recommendation?

You really don't need to share your disks between ES nodes. By default,
ES uses the 'local' gateway as a permanent data store, which is not
shared between nodes. You could configure it to use a shared gateway,
but generally there is no need for this - local gateway performs better.

Another tip: when you are configuring your mapping, don't set each field
to be stored. By default, ES stores your whole doc under the _source
field, which means that it can return the whole doc to you in one go, as
opposed to doing a disk seek for every field.

clint


(phobos182) #3

However, you probably filter your results (eg by category, tags, date
etc). It probably makes sense that these are cached. ES caches various
filters by default, which improves performance.

I would not change this. We do a lot of faceting and filtering, so I would still put aside memory for the cache.

There is no doubt that having your data in memory makes things faster.
The beauty of ElasticSearch is the built in clustering, so that you're
not limited by the memory available on a single node. You can spread
out your docs over many nodes.

Even though this is true, I do not have the cash to order 10-13TB of memory to fit the indexes into RAM. Cache warming also comes into play, as I would have to read the data from disk just to get it into memory. There's the rub. We would also leave memory for the OS page cache, but it's still a lot of disk access.

You really don't need to share your disks between ES nodes. By default,
ES uses the 'local' gateway as a permanent data store, which is not
shared between nodes. You could configure it to use a shared gateway,
but generally there is no need for this - local gateway performs better.

We would not be using any shared system; shared-nothing is what we would do. I would give each index a replication factor, and have each node run a local RAID-0 array of SSDs.

Another tip: when you are configuring your mapping, don't set each field
to be stored. By default, ES stores your whole doc under the _source
field, which means that it can return the whole doc to you in one go, as
opposed to doing a disk seek for every field.

Not every field would be indexed; we would be relying on the _source field for a lot of this. But that still does not solve the problem of fast access when retrieving results. If somebody wants the top 50,000 documents for a search, they still have to do a "scroll" search for them, which hits the disk on each node and can be very slow due to the random IOPS, as well as the multiple shards queried.


(Berkay Mollamustafaoglu-2) #4

Just to cover the bases, is there any way to shard the data logically
(rather than relying on automated sharding) so that your queries don't
have to be executed against the entire set of docs?
If any logic can narrow down the scope of a query, you may be able to
keep index sizes smaller and store some metadata to help direct the
query to the right indices, which may speed things up.

Regards,
Berkay Mollamustafaoglu
mberkay on yahoo, google and skype



(phobos182) #5

To reduce the complexity around searching, we use index aliases to span all of the indices. For example, one of the data types we index is blogs. We shard by week to keep our index sizes manageable. If a user wants to search all blogs for the entire time range, they can search through the blog alias; if they are looking at a specific time range, they can search just the weekly indices which contain the data for that range. This works pretty well. We also have multiple data types, so this repeats for all 16+ types that we have.
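The weekly-index naming could be sketched like this. The `<type>-<year>-w<week>` scheme is made up; the thread does not say what the real index names look like:

```python
from datetime import date, timedelta

def weekly_indices(doc_type, start, end):
    """List the weekly index names covering [start, end].

    Hypothetical naming scheme: <type>-<ISO year>-w<ISO week>.
    An alias (e.g. plain 'blogs') would span all weeks for
    whole-history searches.
    """
    names = []
    day = start
    while True:
        year, week, _ = day.isocalendar()
        name = "%s-%d-w%02d" % (doc_type, year, week)
        if name not in names:
            names.append(name)
        if day >= end:
            break
        day = min(day + timedelta(days=7), end)
    return names

print(weekly_indices("blogs", date(2011, 7, 1), date(2011, 7, 18)))
```

A time-range query then targets only the handful of names this returns, instead of the whole alias.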

Since we do not know any of the queries that are coming our way (they're ad hoc), we cannot optimize for all patterns. These are not simple key-value lookups or searches for "top" results. Our research analysts might use proximity / fuzzy / term matching across one or all of the indices. They also use faceting quite a bit, as well as term vectors for tag clouds, etc.

They also use stats facets to get rollups of time-series data: for example, how many blog authors were there per search topic, or how many tweets per hashtag over a time series.

Which is why I was asking this question: given the random nature of the system, along with the need to get back a lot of search results (via scroll) for further processing, I was considering SSDs to improve performance and query time. Lucene seems pretty good at returning the "top" results for a search, but not very good at retrieving a large sample of your result set. If my search returned 1,000,000 results and I wanted a random 10% sample, I would need to retrieve 100,000 documents.
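The memory side of that sampling can at least be bounded with reservoir sampling over the scroll, sketched here with a plain iterator standing in for the scrolled hits. Note this does not reduce the disk reads; every hit is still pulled, which is exactly the IOPS problem described above:

```python
import random

def reservoir_sample(hits, k, rng=None):
    """Uniform random sample of k items from a stream (e.g. hits
    coming off a scroll) without holding the whole stream in memory."""
    rng = rng or random.Random()
    sample = []
    for i, hit in enumerate(hits):
        if i < k:
            sample.append(hit)
        else:
            j = rng.randint(0, i)   # keep hit with probability k/(i+1)
            if j < k:
                sample[j] = hit
    return sample

# Stand-in for 1,000,000 scrolled hits; keep a 10% sample.
sample = reservoir_sample(iter(range(1000000)), 100000, random.Random(42))
print(len(sample))   # 100000
```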


(Jason Rutherglen) #6

How many queries per second will be run?



(phobos182) #7

Not too many: less than 100 qps. We do not have this data store public facing; it's for internal use / research only.


(Jason Rutherglen) #8

You could try SSDs; however, the concurrency isn't as good as RAM, i.e.
access is nearly synchronous. I have used SSDs when the QPS is 1-4 per
second and there are billions of documents.



(phobos182) #9

Considering I have infinite money and could put the entire index into memory: would you still need SSDs just to read the data from disk into the cache in the first place?

I swear this problem would be much easier if I could just return the top "x" results instead of a percentage of the total results.


(ofavre) #10

I swear this problem would be much easier if I could just return the top
"x" results instead of a percentage of the total results.

What about a count query (http://www.elasticsearch.org/guide/reference/api/count.html)
followed by a classic query asking for the top "x" results?
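As a sketch of the arithmetic this suggestion implies (the page size of 5,000 is an arbitrary assumption; the actual count and search calls are omitted):

```python
def fetch_plan(total_hits, pct, page_size):
    """Given a count result, how many docs a pct-percent sample needs
    and how many scroll pages it takes to fetch them."""
    wanted = total_hits * pct // 100
    pages = (wanted + page_size - 1) // page_size   # ceiling division
    return wanted, pages

print(fetch_plan(1000000, 10, 5000))   # (100000, 20)
```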

--
Olivier Favre


(Berkay Mollamustafaoglu-2) #11

It may be a good idea to take a look at FusionIO products.
http://www.fusionio.com/products/
The specs look very promising, a lot better than plain SSDs. It would be
great to find out how it performs with ElasticSearch.




(phobos182) #12

The issue is that I cannot just take the "top x" results. We take the body of the documents and put it into another system for natural language processing / text analytics, so the documents have to be pulled from ElasticSearch into that other system for real-time processing.

As it pertains to SSD / FusionIO, we have used both. But that was the question I was driving at: does anybody get deep pagination / scroll performance without NAND flash? I know that I am attempting to use ElasticSearch as a document store, and maybe that is not a good idea. I'm thinking I may have to implement another key-value system like HBase / Riak / etc., use ElasticSearch as a secondary index, and then pull the documents from that other system instead of ElasticSearch.
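That secondary-index pattern could be sketched like this, with a plain dict standing in for the external key-value store (HBase, Riak, etc.) and the ElasticSearch search reduced to the list of matching IDs it would return:

```python
def fetch_bodies(matching_ids, kv_store):
    """Secondary-index pattern: the search engine returns only doc IDs;
    the bodies come from a separate key-value store, which absorbs the
    random-read load instead of the search cluster's disks."""
    return [kv_store[doc_id] for doc_id in matching_ids]

# Stand-in key-value store and a stand-in search result:
kv = {"doc-1": "body one", "doc-2": "body two", "doc-3": "body three"}
ids_from_search = ["doc-3", "doc-1"]
print(fetch_bodies(ids_from_search, kv))   # ['body three', 'body one']
```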

