Suggestion for a Test ELK Deployment (Best Performance / Load Balancing / Security / HA)


(stefano ruggiero) #1

Hi all,

I would like to start this conversation to discuss the best ELK
architecture for a test environment, given our hardware and our needs.

What we have:

  • 4+ ES nodes
    • 2 with 24 GB of RAM, 800 GB of SAS disk, 4 x2 CPU
    • 2 with 16 GB of RAM, 500 GB of SAS disk, 4 x2 CPU
  • 10+ LS collectors
  • 2+ Kibana instances

We have 2 separate datacentres; in fact, the resources in the list above
are mirrored, so for example we have 2 ES nodes in the first location and
the other 2 in the second location. The two sites are linked by double
redundant 10 Gbit fiber.

Our test is to understand how the ELK stack performs when indexing all
application and server events, so we are talking about 200 events per
second in the test lab. We would like a retention of 2 or 3 months for
searching those logs with Kibana, and then to close and back up old
indices, which we have tested and found to work well with the curator plugin.
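
To put rough numbers on that retention target, here is a minimal Python
sketch. The 200 events per second and the 2 to 3 month window come from
the paragraph above; the 500-byte average event size is purely an
assumption for illustration and should be replaced with a measured value.

```python
EVENTS_PER_SECOND = 200          # rate stated for the test lab
SECONDS_PER_DAY = 24 * 60 * 60


def events_retained(days, rate=EVENTS_PER_SECOND):
    """Number of events kept for a retention window of `days` days."""
    return rate * SECONDS_PER_DAY * days


def raw_size_gb(days, avg_event_bytes, rate=EVENTS_PER_SECOND):
    """Raw (pre-replica, pre-index-overhead) volume in decimal gigabytes.

    avg_event_bytes is an assumed figure, not something measured here.
    """
    return events_retained(days, rate) * avg_event_bytes / 1e9


if __name__ == "__main__":
    for days in (60, 90):
        print(days, "days:", events_retained(days), "events,",
              round(raw_size_gb(days, avg_event_bytes=500), 1), "GB raw")
```

Each replica multiplies the raw volume again, so the result is only a
lower bound on the disk actually needed.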

What is the best configuration for load balancing events across the two
locations? I mean that every collector should have 2 available choices for
its output in case one node goes down or is performing badly. What do you
suggest?
We tried Nginx with health checks, but I think ES should be able to do
something similar for load balancing the indexing process with a node set
to master false, data false, even though we read in the community that this
type of node is reserved for balancing searches and not indexing, which
always goes through the master of the cluster. Am I right?

What is the best configuration that you have tested? I mean, how many
shards and how many replicas for a fully highly available and redundant
solution?
We tried 2 shards and one replica on 4 data nodes because, as we see it,
replicas are involved in the search process, so it could be a good solution
to reserve some nodes only for replicas. What we are missing is this: if a
node goes down or a datacentre dies, can we have all the data automatically
on the other side (just with replicas)? (We know that the golden rule says
a cluster needs 5 nodes and at least 3 master nodes, so having only 2 DCs
could be critical, because one DC would need to have more nodes and become
the leader of the whole cluster...)
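
The master-quorum concern in the parenthesis can be checked with a few
lines of Python. This is only a sketch, assuming the usual majority rule
for master-eligible nodes (N / 2 + 1, the value normally recommended for
discovery.zen.minimum_master_nodes); the DC names and node counts are
illustrative.

```python
def quorum(master_eligible):
    """Majority of master-eligible nodes: floor(N / 2) + 1."""
    return master_eligible // 2 + 1


def survives_dc_loss(nodes_per_dc, lost_dc):
    """True if the surviving DCs still hold a master quorum.

    nodes_per_dc maps a DC name to its number of master-eligible nodes.
    """
    total = sum(nodes_per_dc.values())
    remaining = total - nodes_per_dc[lost_dc]
    return remaining >= quorum(total)


if __name__ == "__main__":
    even = {"dc1": 2, "dc2": 2}      # the 4-node layout described above
    uneven = {"dc1": 3, "dc2": 2}    # the 5-node "golden rule" layout
    for layout in (even, uneven):
        for dc in layout:
            print(layout, "lose", dc, "->", survives_dc_loss(layout, dc))
```

With 2 + 2 nodes neither side can elect a master alone, and with 3 + 2
only the larger side can, which is exactly the asymmetry described above.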

*What is your best configuration from a security perspective?*
We also tested nginx as a reverse proxy with standard authentication to
prevent unwanted DELETE and PUT requests, but we are looking for a stronger
solution with more flexibility and a roles/permissions configuration, like
a standard SQL DB. Our network layer is really strong: every ELK layer has
its own DMZ, ACLs and firewall rules.
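
As an illustration of the kind of roles/permissions check meant here, the
following Python sketch decides whether a request should be forwarded to
ES. Everything in it is hypothetical: the role names, the allowed methods
and the index patterns are assumptions for the example, not an existing ES
or nginx feature.

```python
import fnmatch

# Hypothetical role table: allowed HTTP methods and index name patterns.
ROLES = {
    "reader": {"methods": {"GET", "HEAD"}, "indices": ["logstash-*"]},
    "writer": {"methods": {"GET", "HEAD", "POST"}, "indices": ["logstash-*"]},
    "admin": {"methods": {"GET", "HEAD", "POST", "PUT", "DELETE"},
              "indices": ["*"]},
}


def is_allowed(role, method, path):
    """Return True if `role` may send `method` to the index named in `path`."""
    rule = ROLES.get(role)
    if rule is None:
        return False                      # unknown roles are rejected
    if method.upper() not in rule["methods"]:
        return False                      # e.g. DELETE/PUT for a reader
    # The first path segment of an ES REST URL is the index name.
    index = path.lstrip("/").split("/", 1)[0]
    return any(fnmatch.fnmatchcase(index, pat) for pat in rule["indices"])


if __name__ == "__main__":
    print(is_allowed("reader", "GET", "/logstash-2014.07.12/_search"))
    print(is_allowed("reader", "DELETE", "/logstash-2014.07.12"))
```

A check like this would sit in the proxy layer in front of ES, in the same
place where nginx currently filters DELETE and PUT.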

I am especially worried about the ES configuration, such as shards,
replicas and load balancing. I think this conversation should be helpful
for a very large community audience that has some doubts about ES and the
ELK stack in general.

Best Regards,
Stefano

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/d3ca4aaf-a1ee-4732-9ae5-629dd8198e7b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Mark Walkom) #2

As you may have read, ES is latency sensitive, so having a cluster
across your DCs isn't recommended.
You may want to look at tribe nodes and run two separate clusters;
that way you get around your problems with wanting all data available in
both DCs and with cross-DC load balancing.

Around shard counts, to balance load you ideally want one shard
per node, then create replicas based on what you require. Trying to set up
replica-only nodes isn't worth the trouble though.

Security-wise, the base setup you have is good, but you may want to have a
look at some of the community-based solutions for ACLs if that's what you
want.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com

On 12 July 2014 21:02, Stefano Ruggiero stefano.secure@gmail.com wrote:



(stefano ruggiero) #3

Thanks a lot for the answer. You confirm my worries about a cross-datacenter
cluster, even though I thought ES had been built for this type of situation
as well. The problem I see is that with tribe nodes we can't have full HA
across the 2 DCs: even if it is a good solution for searching across all
DCs, it isn't a good solution for replicating indexed data. Am I right?

So you suggest having 4 shards and one replica, so 1 primary and 1 replica
per node? (Obviously, if we have 4 nodes installed across 2 DCs.)

What do you use to load balance indexing across nodes when one of them goes
down, given that Logstash allows only 1 IP or domain in the output
configuration?
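
The arithmetic behind that question can be sketched in Python as follows.
It only counts shard copies and assumes ES spreads them evenly over the
data nodes; it does not model which node actually holds which shard.

```python
def total_shard_copies(primaries, replicas):
    """Primaries plus all replica copies for one index."""
    return primaries * (1 + replicas)


def copies_per_node(primaries, replicas, data_nodes):
    """Average number of shard copies per data node, assuming an even spread."""
    return total_shard_copies(primaries, replicas) / data_nodes


if __name__ == "__main__":
    # 4 primaries + 1 replica on 4 nodes: 8 copies, 2 per node
    print(total_shard_copies(4, 1), copies_per_node(4, 1, 4))
    # 2 primaries + 1 replica on 4 nodes: 4 copies, 1 per node
    print(total_shard_copies(2, 1), copies_per_node(2, 1, 4))
```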

Regards

On Sunday, 13 July 2014 at 02:28:01 UTC+2, Mark Walkom wrote:



(Mark Walkom) #4

ES was definitely not built for cross DC.
You can aim for cross-DC redundancy with snapshot+restore functionality
though.

Regarding shards, it's best practice to have one primary shard per node.
How many replicas you set up is more of a personal choice: the more replicas
you have, the more redundancy and search throughput, but also the
more storage space and memory you use.

Regarding load balancing/redundancy for LS, there are a few options: you
can look at HAProxy or Zookeeper, for example, or you could try using an
anycast endpoint. This is more of an open question that depends on your
requirements and systems.
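
Whatever tool ends up doing it, the failover logic boils down to "send to
the first endpoint that passes a health check". A minimal Python sketch of
that idea follows; the endpoint URLs are made up, and the check simply
calls the ES `_cluster/health` API over HTTP. It is not how HAProxy or
Logstash implement it, just the same principle in a few lines.

```python
import urllib.request


def http_health_check(base_url, timeout=2.0):
    """True if GET <base_url>/_cluster/health answers with HTTP 200.

    Connection errors, HTTP error statuses and timeouts count as unhealthy.
    """
    try:
        with urllib.request.urlopen(base_url + "/_cluster/health",
                                    timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False


def pick_endpoint(endpoints, is_healthy=http_health_check):
    """Return the first healthy endpoint in priority order, or None."""
    for url in endpoints:
        if is_healthy(url):
            return url
    return None


if __name__ == "__main__":
    # Hypothetical hosts: local DC first, remote DC as fallback.
    print(pick_endpoint(["http://es-dc1:9200", "http://es-dc2:9200"]))
```

The health check is passed in as a parameter so that it can be swapped for
something stricter, for example one that also inspects the cluster status.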

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com

On 13 July 2014 16:31, Stefano Ruggiero stefano.secure@gmail.com wrote:



(sales) #5

Stefano,
It looks like you are off to a good start with this. I would like to let
you know that my company now provides ELK stack solutions as a package
that includes everything you are looking for: complete HA, load balancing,
etc. Please let me know if you ever need any assistance with your
deployment or anything else from an implementation perspective. These
solutions are also backed by Elasticsearch support if a support agreement
is signed with them.

http://monster-solutions.net/2014/07/03/elk-stack-logging-solutions-now-available/

On Saturday, July 12, 2014 7:02:43 AM UTC-4, Stefano Ruggiero wrote:


