Many small indices vs one large index?


(Hari Shankar) #1

Hi,

Is there any obvious problem if I have a huge number of indices? Say I keep
10 indices for each customer, and I have 1000 customers, which would mean
10000 indices. I was thinking of this approach because I have a few
customers with 10 times as much data as most of the other customers. If I
keep all customer data in 10 indices, all the smaller customers would also
have to wait longer for data because of a few large customers, since
queries, updates etc will take longer. Moreover, I'd never have to
cross-query between customers. So data for each of my customers is
independent. Is this approach feasible at a high level? I thought I'd ask
here because it is quite difficult to recreate it in a test environment.

Hari


(Karussell) #2

This is feasible although I'd go to put it into 'one system' and doing
some other index-splitting (e.g. based on date)
so that also customers with a big index will benefit of this. And you
would also not waste memory etc (every index comes with some overhead)
for very small indices.

On Jun 25, 6:57 am, Hari Shankar shaan.h...@gmail.com wrote:

Hi,

Is there any obvious problem if I have a huge number of indices? Say I keep
10 indices for each customer, and I have 1000 customers, which would mean
10000 indices. I was thinking of this approach because I have a few
customers with 10 times as much data as most of the other customers. If I
keep all customer data in 10 indices, all the smaller customers would also
have to wait longer for data because of a few large customers, since
queries, updates etc will take longer. Moreover, I'd never have to
cross-query between customers. So data for each of my customers is
independent. Is this approach feasible at a high level? I thought I'd ask
here because it is quite difficult to recreate it in a test environment.

Hari


(Hari Shankar) #3

Thanks a lot for responding. In fact I'd need to query all-time data, so
maybe instead of time-based groups, we'd have to develop some other grouping
strategy to split indices, maybe grouping customers such that total index
size of each group is approximately same.

The biggest issue I am facing with having a single large index is update
times, it takes almost a minute to update 200,000 records. I have
replication set to 3, but reducing it to 1 did not have a huge impact on
update times. I have 7 shards on 7 machines currently. What other parameters
can I tweak to improve update times?

Thanks,
Hari

On Sat, Jun 25, 2011 at 4:19 PM, Karussell tableyourtime@googlemail.comwrote:

This is feasible although I'd go to put it into 'one system' and doing
some other index-splitting (e.g. based on date)
so that also customers with a big index will benefit of this. And you
would also not waste memory etc (every index comes with some overhead)
for very small indices.

On Jun 25, 6:57 am, Hari Shankar shaan.h...@gmail.com wrote:

Hi,

Is there any obvious problem if I have a huge number of indices? Say I
keep
10 indices for each customer, and I have 1000 customers, which would mean
10000 indices. I was thinking of this approach because I have a few
customers with 10 times as much data as most of the other customers. If I
keep all customer data in 10 indices, all the smaller customers would
also
have to wait longer for data because of a few large customers, since
queries, updates etc will take longer. Moreover, I'd never have to
cross-query between customers. So data for each of my customers is
independent. Is this approach feasible at a high level? I thought I'd ask
here because it is quite difficult to recreate it in a test environment.

Hari


(Clinton Gormley) #4

On Sat, 2011-06-25 at 19:17 +0530, Hari Shankar wrote:

Thanks a lot for responding. In fact I'd need to query all-time data,
so maybe instead of time-based groups, we'd have to develop some other
grouping strategy to split indices, maybe grouping customers such that
total index size of each group is approximately same.

Index aliases can point to more than one index (for read purposes). So
you could have two aliases:

  • index_read: [ index_2009,index_2010,index_2011]
  • index_write: index_2011

So your app would always write to 'index_write' and read from
'index_read'. All you need to do then is to create a new index amd
update the aliases once a year.

The biggest issue I am facing with having a single large index is
update times, it takes almost a minute to update 200,000 records. I
have replication set to 3, but reducing it to 1 did not have a huge
impact on update times. I have 7 shards on 7 machines currently. What
other parameters can I tweak to improve update times?

You could reduce the refresh_interval from its default 1 second, add
more primary shards, play with the translog settings, get faster
machines,.... running out of options here.

3,200 records per second sounds quite fast to me, but maybe i'm not
aiming high enough :slight_smile:

clint


(Hari Shankar) #5

On Sat, Jun 25, 2011 at 7:51 PM, Clinton Gormley clinton@iannounce.co.ukwrote:

On Sat, 2011-06-25 at 19:17 +0530, Hari Shankar wrote:

Thanks a lot for responding. In fact I'd need to query all-time data,
so maybe instead of time-based groups, we'd have to develop some other
grouping strategy to split indices, maybe grouping customers such that
total index size of each group is approximately same.

Index aliases can point to more than one index (for read purposes). So

you could have two aliases:

  • index_read: [ index_2009,index_2010,index_2011]
  • index_write: index_2011

So your app would always write to 'index_write' and read from
'index_read'. All you need to do then is to create a new index amd
update the aliases once a year.

Won't this reduce read/search efficiency, since it now has to query more
indices and merge, e.g if I have to sort? Or is this overhead small?

The biggest issue I am facing with having a single large index is
update times, it takes almost a minute to update 200,000 records. I
have replication set to 3, but reducing it to 1 did not have a huge
impact on update times. I have 7 shards on 7 machines currently. What
other parameters can I tweak to improve update times?

You could reduce the refresh_interval from its default 1 second, add
more primary shards, play with the translog settings, get faster
machines,.... running out of options here.

I tried setting refresh_interval to -1 but it did not have to seem much of
an impact. I was using bulk indexer Java API. I guess the bulk indexer
handles the refresh this itself? Can I force it to use all cores of the CPU?
I did an htop on all machines and found that only one CPU was usually being
used ~100% whereas others were relatively idle. Increasing ES_MAX_MEM helped
until a point, but now it has no impact so I guess RAM is sufficient.

One thing I wanted to confirm - What will be more advantageous w.r.t
improving indexing speed: adding more machines but fixing the number of
shards at 7 or adding more machines and also increasing no. of shards
correspondingly?

Can I make replication asynchronous?

3,200 records per second sounds quite fast to me, but maybe i'm not
aiming high enough :slight_smile:

Yeah, I know... it is fast if you ask me considering this is an index and
not a DB, but who can explain that to the marketing guys :slight_smile: .. Our earlier
system was completely based on RDBMS, now we are trying to move to HBase
(for writes and batch processing) + es (for the front-end reads). On most
fronts, the new system seems to be performing much better (as expected),
than the older one, there are these few areas where the earlier system was
stronger (seems), like updates and joining. We are just trying to close the
gap in these areas as much as possible.. :slight_smile:

Thanks for your time..
Hari


(Shay Banon) #6

On Saturday, June 25, 2011 at 6:34 PM, Hari Shankar wrote:

On Sat, Jun 25, 2011 at 7:51 PM, Clinton Gormley <clinton@iannounce.co.uk (mailto:clinton@iannounce.co.uk)> wrote:

On Sat, 2011-06-25 at 19:17 +0530, Hari Shankar wrote:

Thanks a lot for responding. In fact I'd need to query all-time data,
so maybe instead of time-based groups, we'd have to develop some other
grouping strategy to split indices, maybe grouping customers such that
total index size of each group is approximately same.

Index aliases can point to more than one index (for read purposes). So
you could have two aliases:

  • index_read: [ index_2009,index_2010,index_2011]
  • index_write: index_2011

So your app would always write to 'index_write' and read from
'index_read'. All you need to do then is to create a new index amd
update the aliases once a year.

Won't this reduce read/search efficiency, since it now has to query more indices and merge, e.g if I have to sort? Or is this overhead small?
The overhead is combining search across shards, not across indices. Searching on 10 indices with 1 shard is the same as searching 1 index with 10 shards. The overhead of searching across shards is not high, it uses non blocking IO to communicate with all the nodes, and (usually) needs to merge a small set of ids / scores.

The biggest issue I am facing with having a single large index is
update times, it takes almost a minute to update 200,000 records. I
have replication set to 3, but reducing it to 1 did not have a huge
impact on update times. I have 7 shards on 7 machines currently. What
other parameters can I tweak to improve update times?

You could reduce the refresh_interval from its default 1 second, add
more primary shards, play with the translog settings, get faster
machines,.... running out of options here.

I tried setting refresh_interval to -1 but it did not have to seem much of an impact. I was using bulk indexer Java API. I guess the bulk indexer handles the refresh this itself? Can I force it to use all cores of the CPU? I did an htop on all machines and found that only one CPU was usually being used ~100% whereas others were relatively idle. Increasing ES_MAX_MEM helped until a point, but now it has no impact so I guess RAM is sufficient.

One thing I wanted to confirm - What will be more advantageous w.r.t improving indexing speed: adding more machines but fixing the number of shards at 7 or adding more machines and also increasing no. of shards correspondingly?
Are you indexing from a single thread using the batch API?

Can I make replication asynchronous?
Yes, you can set the replication type to be async.

3,200 records per second sounds quite fast to me, but maybe i'm not
aiming high enough :slight_smile:
Yeah, I know... it is fast if you ask me considering this is an index and not a DB, but who can explain that to the marketing guys :slight_smile: .. Our earlier system was completely based on RDBMS, now we are trying to move to HBase (for writes and batch processing) + es (for the front-end reads). On most fronts, the new system seems to be performing much better (as expected), than the older one, there are these few areas where the earlier system was stronger (seems), like updates and joining. We are just trying to close the gap in these areas as much as possible.. :slight_smile:
The perf of how many docs per second really depends on many factors, including the type of docs you index. Other things that can help is disabling the _all field, increasing the number of shards per index, indexing less fields (simply disable indexing on fields you are not going to query / facet on).

Thanks for your time..
Hari


(Hari Shankar) #7

On Sun, Jun 26, 2011 at 3:14 AM, Shay Banon shay.banon@elasticsearch.comwrote:

The overhead is combining search across shards, not across indices.
Searching on 10 indices with 1 shard is the same as searching 1 index with
10 shards. The overhead of searching across shards is not high, it uses non
blocking IO to communicate with all the nodes, and (usually) needs to merge
a small set of ids / scores.

Ah, ok.. great. Thanks for explaining this Shay.. :slight_smile:
So more shards => greater write speed but lesser read speed and more
replicas => greater read speed but lesser write speed?

Are you indexing from a single thread using the batch API?

Yes, currently I am using a single thread, unless the API itself does
multi-threading in background. Is there an API that does this, or is there a
setting which makes this multithreaded?

Yes, you can set the replication type to be async.

Found it.. thanks!
Hari


(Karussell) #8

you could also tune lucene settings:

http://www.elasticsearch.org/guide/reference/api/admin-indices-update-settings.html

E.g. read the 'Bulk Indexing Usage' section

it takes almost a minute to update 200,000 records.

are you updating at once or dividing the records into chunks of 5000
or so? also as shay points out, you can use multiple threads to feed
the index.

Regards,
Peter.


(Hari Shankar) #9

Awesome! We could improve the speed by ~5x by using 10 threads.... :slight_smile: I
think that is more than enough for our requirements..

Thanks a lot guys!

On Mon, Jun 27, 2011 at 7:07 PM, Karussell tableyourtime@googlemail.comwrote:

you could also tune lucene settings:

http://www.elasticsearch.org/guide/reference/api/admin-indices-update-settings.html

E.g. read the 'Bulk Indexing Usage' section

it takes almost a minute to update 200,000 records.

are you updating at once or dividing the records into chunks of 5000
or so? also as shay points out, you can use multiple threads to feed
the index.

Regards,
Peter.


(Shay Banon) #10

Just to complete the picture and explain how the bulk indexing works. A single bulk request ends up being broken down into the relevant bulk items per shard, and then, executed on that shard. The bulk operations are executed in a serial manner. If you have more shards, then more operation will happen concurrently, if you have more shards then nodes, then you already doing concurrent indexing on the same node. So, threading / more processes usually helps to improve on that.

On Monday, June 27, 2011 at 5:07 PM, Hari Shankar wrote:

Awesome! We could improve the speed by ~5x by using 10 threads.... :slight_smile: I think that is more than enough for our requirements..

Thanks a lot guys!

On Mon, Jun 27, 2011 at 7:07 PM, Karussell <tableyourtime@googlemail.com (mailto:tableyourtime@googlemail.com)> wrote:

you could also tune lucene settings:

http://www.elasticsearch.org/guide/reference/api/admin-indices-update-settings.html

E.g. read the 'Bulk Indexing Usage' section

it takes almost a minute to update 200,000 records.

are you updating at once or dividing the records into chunks of 5000
or so? also as shay points out, you can use multiple threads to feed
the index.

Regards,
Peter.


(system) #11