ES Nodes storage capacity

Hi all,

I have designed and set up an ELK stack that can (I hope) be expanded
horizontally, using the following components:

haproxy x2 (failover) => Logstash x2 => nginx x2 (failover) =>
elasticsearch x2

The capacity of the Elasticsearch nodes has come into question, and this
has raised a few questions regarding the spec of a new node.

Here are the current specs of the Elasticsearch nodes (I have 2 of these):

· HP DL360p Gen8 10-SFF CTO Server

· 64G PC3L-12800R-11

· 8x 900G 10K SAS

· 2x 300G 10K SAS

· 2G FBWC

· Dual 750W PSU

· 4P 1GBE 331FLR

I am currently running the OS from the 2x 300GB drives in a mirrored RAID,
with 2 logical data drives using the 8x 900GB drives (so 2 striped RAID
sets, each containing 4x 900GB drives).

This is all working fine, but the data capacity has become an issue (14TB
total available). I think for the moment I have enough compute power, but
what would happen if I added one or more lower-spec nodes, marked as data
nodes (non-master), but with a different storage capacity available, say
20TB in each?

If the original 2 nodes filled their data stores and the only store
available was on the new node(s), then the new node(s) would be handling
the shards alone and there would be no protection from replicas if that
node went down, no?

An additional question, not as critical at the moment:

If I eventually have the same issue with compute power, that is, if these 2
nodes someday become saturated and I make another node a master node with
half the spec, would Elasticsearch realize this and distribute the load
accordingly, or is this purely down to nginx load distribution?

Thanks in advance for any help/advice.

Simon

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/8450d29f-9a1b-40c0-87e5-9fa2b18f364d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Can you elaborate on what you mean by the capacity becoming an issue?

When you add a node to the cluster, Elasticsearch will automatically start
to reallocate shards to it. You can't have a node sitting there idle with
lots of free disk space, waiting for the other nodes to fill up before
being called upon.

As for nginx, it depends on what you do with it. ES will spread the load
across the cluster automatically, but if you're using nginx as a front end
then it's up to you to factor the other nodes in.

It seems that some of the systems are going to be generating more logs than
we previously thought (approx 250GB a day) so 14TB won't last too long. So
we are looking into new nodes.
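To put a rough number on "won't last too long", here is a quick Python
sketch of the arithmetic. It is only an estimate under my own assumptions:
that the 250GB a day is the on-disk size of the primary shards, that each
replica stores the same amount again, and that growth is constant. The
function name and the replica counts are just for illustration.

```python
def days_until_full(capacity_gb, daily_ingest_gb, replicas=0):
    """Estimate how many days of logs fit in capacity_gb.

    Assumption: every replica copy stores the data again, so on-disk
    growth per day is daily_ingest_gb * (1 + replicas).
    """
    if daily_ingest_gb <= 0:
        raise ValueError("daily_ingest_gb must be positive")
    return capacity_gb / (daily_ingest_gb * (1 + replicas))


current_gb = 14_000   # the 14TB available today across both nodes
new_node_gb = 20_000  # the hypothetical 20TB data node

print(days_until_full(current_gb, 250))                            # no replicas
print(days_until_full(current_gb, 250, replicas=1))                # one replica
print(days_until_full(current_gb + new_node_gb, 250, replicas=1))  # with new node
```

With these assumptions the current 14TB covers roughly 56 days without
replicas and about 28 days with one replica, which is why we are looking at
more storage.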

I think we currently have enough compute power (won't know until this goes
live) so the current nodes can be kept as masters.

So what would happen if, say, we added a 20TB non-master data node and the
storage use got to 21TB (distributed over the 3 nodes, so 7TB on each,
making the first 2 nodes full)?

Would it then only store the data/replicas/shards on the new 3rd node, as
that is the only one with space?

Thanks

Nope, it will still try to store data on the nodes that have no more
space, causing problems.
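As a rough illustration of why mixed disk sizes hurt here, the Python
sketch below models the simplest possible case: data is spread evenly
across all nodes with no regard for free disk. That even spread is my
simplifying assumption, not a description of how the real allocator
decides, and the function names are made up for the example.

```python
def usable_before_first_full(capacities_tb):
    """Total data held when the smallest node fills up.

    Assumption: every node receives an equal share of the data,
    so the smallest disk sets the limit for the whole cluster.
    """
    return len(capacities_tb) * min(capacities_tb)


def stranded_capacity(capacities_tb):
    """Disk space still free on larger nodes once the smallest is full."""
    return sum(capacities_tb) - usable_before_first_full(capacities_tb)


nodes_tb = [7, 7, 20]  # the two existing nodes plus the proposed 20TB node

print(usable_before_first_full(nodes_tb))  # 21
print(stranded_capacity(nodes_tb))         # 13
```

Under that model the two 7TB nodes are full at 21TB total, while 13TB on
the new node is still free but does not stop the smaller nodes running out.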
