Cluster sizing

Hello there, I was wondering if anyone here would mind sharing a bit about
sizing the cluster.

Today we have a 5-node ES cluster (each node has 64GB RAM and 12 cores). Our
index is around 96GB spread across 12 shards. We have replica = 1, and each
ES instance is set to 31GB of Xmx.

Recently we have been facing some outages and we need to expand, but it is
hard to get another beefy machine like that, so we are considering getting
several squirrels instead of one bear :)

My thoughts were:

64 medium nodes on Amazon, each with 7.5GB RAM; increase our shard count to
a larger number like 64, so each shard would be only around 1.5GB; and set
replicas to 2, so each node would host only 3 shards. The rationale behind
that would be: 3 x 1.5GB = 4.5GB, which is around what we expect to have
reserved as filesystem cache memory. 4GB for ES and leave 3.5GB for the OS.
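
The arithmetic in this plan can be checked with a small sketch. The function
name and fields below are made up for illustration, and it assumes shards
are spread perfectly evenly across nodes, which a real cluster may not do.

```python
def plan(index_gb, primaries, replicas, nodes, ram_gb, heap_gb):
    """Rough per-node numbers for a cluster layout (even shard spread assumed)."""
    shard_gb = index_gb / primaries            # size of one shard
    copies = primaries * (1 + replicas)        # primaries plus all replica copies
    per_node = copies / nodes                  # shard copies hosted by each node
    data_gb = per_node * shard_gb              # index data held by each node
    off_heap_gb = ram_gb - heap_gb             # RAM left for the OS and file cache
    return {
        "shard_gb": shard_gb,
        "shards_per_node": per_node,
        "data_per_node_gb": data_gb,
        "off_heap_gb": off_heap_gb,
        "data_fits_off_heap": data_gb <= off_heap_gb,
    }


# Layout proposed in the post: 64 shards, 2 replicas, 64 nodes of 7.5GB, 4GB heap.
proposed = plan(96, 64, 2, 64, 7.5, 4)
# Layout in use today: 12 shards, 1 replica, 5 nodes of 64GB, 31GB heap.
current = plan(96, 12, 1, 5, 64, 31)

for name, p in (("proposed", proposed), ("current", current)):
    print(name, p)
```

Under these assumptions each proposed node holds 4.5GB of index data but has
only 3.5GB outside the heap, so the data would not fully fit in the
filesystem cache once the OS takes its share.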

I would love to have some feedback from the community on this approach. My
initial take is that with several smaller instances, we can always add a
few dozen more from time to time.

Any thoughts?

Regards

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Hi,

I didn't double-check your math, but I don't see anything wrong with this
approach, other than more management and higher chances of one of those 64
instances going down. Also note that your dedicated (I assume) server
can't really be compared to EC2 VMs when it comes to performance... so I
would be a little conservative when making calculations.

Otis

ELASTICSEARCH Performance Monitoring - http://sematext.com/spm/index.html

Hiya

Today we have a 5 node ES cluster (each node has 64GB RAM 12 cores),
our index is around 96gb spread across 12 shards. We have replica = 1
And each ES instance is set to have 31GB of Xmx.

You're getting outages with an index of only 96GB, on those machines?
That surprises me. I'd be looking for what is causing the outages,
rather than changing cluster size.

64 medium nodes on Amazon, each with 7.5GB RAM; increase our shard
count to a larger number like 64, so each shard would be only around
1.5GB; and set replicas to 2, so each node would host only 3 shards.
The rationale behind that would be: 3 x 1.5GB = 4.5GB, which is
around what we expect to have reserved as filesystem cache memory. 4GB
for ES and leave 3.5GB for the OS.

Dividing such a small index into 64 shards will probably skew the term
frequency distribution (term statistics are calculated per shard), which
may give you unexpected relevance scores.
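
To illustrate the skew: a minimal sketch using the classic Lucene TF/IDF
inverse document frequency, 1 + ln(numDocs / (docFreq + 1)). The document
counts below are invented for the example, and it assumes each shard scores
with its own local statistics.

```python
import math


def idf(doc_freq, num_docs):
    """Classic Lucene-style inverse document frequency."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical numbers: 6400 docs in total, a term that appears in 10 of them.
global_idf = idf(doc_freq=10, num_docs=6400)

# Split over 64 shards of 100 docs each, the 10 matches land unevenly.
shard_with_three = idf(doc_freq=3, num_docs=100)
shard_with_one = idf(doc_freq=1, num_docs=100)

print("one big shard      :", round(global_idf, 3))
print("shard with 3 hits  :", round(shard_with_three, 3))
print("shard with 1 hit   :", round(shard_with_one, 3))
```

The same term gets a different weight depending on which shard a document
happens to live in, and the fewer documents per shard, the noisier this gets.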

Also, the smaller the instance on EC2, the poorer the IO throughput, the
noisier the neighbours etc.

Again, I'd try to figure out why you're getting outages - it is probably
quite easy to solve.

clint

Hi Clinton, yeah, we are having some problems here. We narrowed it down to
some nasty queries that the frontend started to execute. It seems that those
queries (some count queries) were OK, but the moment we started having more
and more of them (they had a spike in numbers) we started seeing nodes where
the search threadpool count spiked to several thousand. That node would then
get removed from the cluster (I'm assuming it became unresponsive). After
that it snowballs: the shards start to be relocated, IO gets high, another
node's search threadpool count goes crazy high, and again the node gets
removed. We are left with only 2 nodes, and since we have the zen discovery
minimum master nodes set to 3, most of the time we are left without a master
(don't know how bad that is)... final result: whole cluster restart.
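
Why 2 surviving nodes cannot elect a master can be sketched in a few lines.
This is only the majority arithmetic, not Elasticsearch's election code, and
it assumes all 5 nodes are master-eligible.

```python
def quorum(master_eligible_nodes):
    """Smallest majority of the master-eligible nodes."""
    return master_eligible_nodes // 2 + 1


def can_elect_master(live_nodes, minimum_master_nodes):
    """A master can only be elected while enough eligible nodes are visible."""
    return live_nodes >= minimum_master_nodes


minimum = quorum(5)  # majority of a 5-node cluster
for live in (5, 4, 3, 2):
    print(live, "nodes up -> master possible:", can_elect_master(live, minimum))
```

A majority setting like this is what prevents split brain, so losing the
master at 2 nodes is the setting doing its job rather than a second failure.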

We got some ES consulting, which was amazing, and we are now looking at a
production subscription, but we really need to get this a bit more stable.

I'm 100% sure that the blame is on us; we are just running bad queries. I
only wish I had the time to look into it, but as in most companies,
management only cares about design when all hell breaks loose, like now. Now
every manager wants us to get a new machine or dedicate more time to
designing queries, but before that it was just: "Can you deliver this
feature this afternoon?"

Regards

Hiya

On Tue, 2013-03-19 at 07:21 -0700, Vinicius Carvalho wrote:

Hi Clinton, yeah, we are having some problems here. We narrowed it down
to some nasty queries that the frontend started to execute. It seems
that those queries (some count queries) were OK, but the moment we
started having more and more of them (they had a spike in numbers) we
started seeing nodes where the search threadpool count spiked to
several thousand. That node would then get removed from the cluster
(I'm assuming it became unresponsive). After that it snowballs: the
shards start to be relocated, IO gets high, another node's search
threadpool count goes crazy high, and again the node gets removed. We
are left with only 2 nodes, and since we have the zen discovery minimum
master nodes set to 3, most of the time we are left without a master
(don't know how bad that is)... final result: whole cluster restart.

Yeah, you want to avoid that :)

I'd start with configuring the various threadpools (eg search) to
prevent thread explosions. By default, ES will just keep trying to
create new threads. It is quite difficult to come up with good defaults
for threadpools, which is why it is currently configured this way, but
with knowledge of your system, you should be able to come up with the
"right" numbers for fixed threadpools and queue size.

That at least should stop your cluster falling over.
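
What a fixed threadpool with a bounded queue buys you can be sketched in
plain Python. This is not how Elasticsearch implements its pools; the class
and its `size` and `queue_size` parameters are hypothetical and only mirror
the idea of "N workers, M queued requests, reject the rest".

```python
import queue
import threading


class BoundedPool:
    """Fixed number of workers plus a bounded backlog; extra work is rejected."""

    def __init__(self, size, queue_size):
        # One slot per running task plus one per queued task.
        self._slots = threading.BoundedSemaphore(size + queue_size)
        self._tasks = queue.Queue()
        self._lock = threading.Lock()
        self.rejected = 0
        self.completed = 0
        self._workers = [
            threading.Thread(target=self._run, daemon=True) for _ in range(size)
        ]
        for worker in self._workers:
            worker.start()

    def submit(self, fn):
        """Return True if the task was accepted, False if it was rejected."""
        if not self._slots.acquire(blocking=False):
            with self._lock:
                self.rejected += 1
            return False
        self._tasks.put(fn)
        return True

    def _run(self):
        while True:
            fn = self._tasks.get()
            if fn is None:  # shutdown marker
                return
            try:
                fn()
            finally:
                with self._lock:
                    self.completed += 1
                self._slots.release()

    def shutdown(self):
        """Finish the accepted tasks, then stop the workers."""
        for _ in self._workers:
            self._tasks.put(None)
        for worker in self._workers:
            worker.join()


# A burst of 10 slow "queries" against 2 workers and a queue of 3.
gate = threading.Event()
pool = BoundedPool(size=2, queue_size=3)
accepted = [pool.submit(gate.wait) for _ in range(10)]
gate.set()  # let the slow queries finish
pool.shutdown()
print("accepted:", sum(accepted), "rejected:", pool.rejected)
```

The burst costs some rejected requests, but the thread count never grows past
the configured size, which is the trade-off being suggested here.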

Then look at why you are having problems with queries. The usual issue here
is the amount of RAM you have available. However, you've already got your
heap as big as you can without the JVM uncompressing pointers and using
more memory.
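
The heap ceiling mentioned here comes from simple arithmetic, sketched
below. It assumes the HotSpot default of 8-byte object alignment; a JVM
started with a different alignment would have a different limit.

```python
# A compressed pointer is a 32-bit offset counted in units of the object
# alignment, so it can address 2**32 aligned positions.
REFERENCE_BITS = 32
OBJECT_ALIGNMENT_BYTES = 8  # assumed HotSpot default

max_heap_bytes = (2 ** REFERENCE_BITS) * OBJECT_ALIGNMENT_BYTES
max_heap_gb = max_heap_bytes // 2 ** 30

print("largest heap addressable with compressed pointers:", max_heap_gb, "GB")
```

Staying a little under that limit, as with a 31GB Xmx, keeps references at
4 bytes instead of 8.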

Try using mmapfs instead of niofs. That'll free up a chunk of your heap:
instead of loading file contents onto the heap, it can access that data
directly from the kernel file cache.
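
The difference between the two access styles can be shown with Python's own
`mmap` module. This is only an analogy for the general technique, not
Elasticsearch or Lucene code, and the helper names are made up. In
Elasticsearch itself the store type is, as far as I know, chosen with the
`index.store.type` setting.

```python
import mmap
import os
import tempfile


def read_copy(path, offset, length):
    """Plain read: the bytes are copied into a buffer owned by the process."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def read_mapped(path, offset, length):
    """Memory-mapped read: pages are served from the kernel file cache."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return mapped[offset:offset + length]


# Write a small throwaway file and read the same range both ways.
fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as f:
        f.write(b"0123456789" * 1000)
    copied = read_copy(path, 5, 10)
    mapped = read_mapped(path, 5, 10)
finally:
    os.remove(path)

print(copied, mapped)
```

Both calls return the same bytes; the difference is where the bulk of the
file lives while it is being read.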

We got some ES consulting, which was amazing, and we are now looking at
a production subscription, but we really need to get this a bit more
stable.

Good to hear :)

I'm 100% sure that the blame is on us; we are just running bad
queries. I only wish I had the time to look into it, but as in most
companies, management only cares about design when all hell breaks
loose, like now. Now every manager wants us to get a new machine or
dedicate more time to designing queries, but before that it was
just: "Can you deliver this feature this afternoon?"

Hopefully the tips above will help you to get some stability quickly

clint

Hi Clinton, thanks a lot for your help. I don't want to celebrate yet, but
it seems that the cluster is now way more stable. I had no idea (I missed
this in the docs) that ES actually used an unbounded threadpool. It seems
that the huge number of threads was causing problems with OS context
switching. We can see some rejected searches on the threadpools now, but at
least it's not bringing nodes to their knees. We may have time to look into
our queries now :)

Is niofs the default on ES, or is it simplefs? Just out of curiosity, I
could not find an endpoint that would tell me which fs strategy is in use.

Regards

On Tue, 2013-03-19 at 10:59 -0700, Vinicius Carvalho wrote:

Hi Clinton, thanks a lot for your help. I don't want to celebrate yet,
but it seems that the cluster is now way more stable. I had no idea (I
missed this in the docs) that ES actually used an unbounded threadpool.
It seems that the huge number of threads was causing problems with OS
context switching. We can see some rejected searches on the threadpools
now, but at least it's not bringing nodes to their knees. We may have
time to look into our queries now :)

Good to hear :)

Is niofs the default on ES, or is it simplefs? Just out of curiosity, I
could not find an endpoint that would tell me which fs strategy is in use.

Best reference is here:
https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/elasticsearch/index/store/IndexStoreModule.java#L57

clint
