Best practice - architecture feedback/opinion needed

I have been evaluating ELK for the past 2 weeks in a testing environment, and I am
very pleased with the result.

Now I want to move it to staging, so I want to make sure I have the
recommended setup, and I hope to get your feedback/opinion on it.

Expected usage:

  • up to 20 GB of logs are sent from logstash to elasticsearch every day
    (continuously 24/7)
  • 15 days worth of data should be stored in elasticsearch for
    search/graph.
  • logs older than 15 days should be deleted
  • Daily incremental backup to AWS-S3
  • 7 kibana users with average of 9 graphs per page/saved templates.
    always on 9/7
  • 1 kibana user with no graph, just a live "tail" of specific types. 24/7
  • Cron jobs that curl Elasticsearch directly to perform different tasks
    (their load is negligible)
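
A rough sizing of the figures above, as a minimal sketch. It assumes the on-disk index size is about the same as the raw log volume, which is only an approximation (the real ratio depends on mappings and compression), and the function name is made up for illustration.

```python
def estimated_storage_gb(daily_gb, retention_days, replicas=1):
    """Rough on-disk size of the retained indices, in GB.

    Assumption: 1 GB of raw logs becomes roughly 1 GB of index data.
    """
    primary = daily_gb * retention_days   # data held in primary shards
    return primary * (1 + replicas)       # each replica is a full extra copy


# 20 GB/day kept for 15 days, without and with one replica
print(estimated_storage_gb(20, 15, replicas=0))  # 300
print(estimated_storage_gb(20, 15, replicas=1))  # 600
```

With one replica the cluster as a whole would need about 600 GB of disk under these assumptions, plus headroom.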

I am considering the setups below; please correct me if I am wrong:

  • First option:

Logstash -> Load balancer ->

  • Elasticsearch instance (4 GB RAM/2 cores)

  • Elasticsearch instance (4 GB RAM/2 cores)

  • Elasticsearch instance (4 GB RAM/2 cores)

  • Second option:

Logstash -> Elasticsearch instance (4 GB RAM/2 cores) # Master
            Elasticsearch instance (4 GB RAM/2 cores) # auto-discovered by master

  • Third option:

Logstash -> Elasticsearch (15 GB RAM/4 cores) (I still haven't figured out
how to resolve the yellow status color)

Any advice on points of failure in the above setups would be greatly
appreciated.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/a7c881a3-fb54-4c5c-b497-1a69becbe6a8%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

With one node you can never allocate replicas, which is why it's in a
yellow state.
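
To illustrate why a single node stays yellow, here is a simplified model, not Elasticsearch's actual allocation logic. It assumes all primary shards are assigned and relies only on the rule that a replica is never placed on the same node as its primary; the function name is hypothetical.

```python
def expected_health(data_nodes, replicas):
    """Simplified cluster health: every copy of a shard needs its own node."""
    if data_nodes < 1:
        return "red"                      # nowhere to put even the primaries
    if data_nodes >= replicas + 1:
        return "green"                    # primary and all replicas fit
    return "yellow"                       # primaries fine, replicas unassigned


print(expected_health(data_nodes=1, replicas=1))  # yellow (option 3)
print(expected_health(data_nodes=1, replicas=0))  # green
print(expected_health(data_nodes=3, replicas=1))  # green  (option 1)
```

On a single node the status only turns green if `index.number_of_replicas` is set to 0, which also means there is no second copy of the data.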

I'd go with setup #1 personally, but you probably want more RAM, say 4GB
heap. Then set all to be master eligible so that you get some level of
protection against node loss.
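
The protection from master-eligible nodes comes from majority voting. A minimal sketch of the usual quorum arithmetic behind `discovery.zen.minimum_master_nodes` follows; the helper names are made up for illustration.

```python
def minimum_master_nodes(master_eligible):
    """Majority of master-eligible nodes: (n / 2) + 1, rounded down."""
    return master_eligible // 2 + 1


def survives_loss_of(master_eligible, lost):
    """True if the remaining nodes can still form a majority."""
    return master_eligible - lost >= minimum_master_nodes(master_eligible)


print(minimum_master_nodes(3))      # 2
print(survives_loss_of(3, lost=1))  # True:  option 1 tolerates one node down
print(survives_loss_of(2, lost=1))  # False: option 2 cannot elect a master
```

This is one reason three nodes are preferable to two: with two nodes, a quorum of 2 means losing either node stops master election.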

Make sure you also use curator for managing retention and snapshot+restore.
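
For a sense of what the retention step amounts to, here is a sketch of the selection logic only. It is not curator's API. It assumes the default Logstash daily index naming (`logstash-YYYY.MM.DD`), and the function name is hypothetical.

```python
from datetime import date, datetime, timedelta


def expired_indices(index_names, today, keep_days=15, prefix="logstash-"):
    """Return the daily indices whose date is older than the retention window."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in index_names:
        if not name.startswith(prefix):
            continue                      # e.g. the Kibana index, leave alone
        try:
            day = datetime.strptime(name[len(prefix):], "%Y.%m.%d").date()
        except ValueError:
            continue                      # not a daily index, skip it
        if day < cutoff:
            expired.append(name)
    return expired


names = ["logstash-2015.01.31", "logstash-2015.02.01",
         "logstash-2015.02.02", "logstash-2015.02.17", ".kibana"]
print(expired_indices(names, today=date(2015, 2, 17)))
# ['logstash-2015.01.31', 'logstash-2015.02.01']
```

Dropping whole daily indices is much cheaper than deleting individual documents, which is why time-based index names suit this kind of retention.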

(In reply to dna lor's original post of 15 February 2015, 04:01.)

Thanks for your feedback Mark,

So option 1 is preferable. Can I get clarification on the points you
raised, just so I understand things properly?

  1. Increase the heap to 4 GB: instead of the 3 instances above, I could switch
    to 3 instances with 7.5 GB of RAM each and assign a 6 GB heap to each. Would
    that be the same?
  2. All ES instances will be set as master eligible; does that mean there will be
    identical data on all instances?
    1. In that case, I set my load balancer to round-robin across all the
      ES instances and they will sync among themselves?

Thanks for helping out.

(In reply to Mark Walkom's message of Monday, February 16, 2015 at 9:38:11 AM UTC+2.)

1 - You want to have 50% system memory for heap, the other 50% for caching.
So 4GB heap can be done with 7.5GB in system, but you don't really want to
go higher.
2 - No. See
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-discovery-zen.html#master-election
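
The 50% guideline from point 1 can be checked with a couple of lines. This is only a sketch of the arithmetic, and the function names are made up for illustration.

```python
def suggested_heap_gb(system_ram_gb, fraction=0.5):
    """Heap size under the 'half for heap, half for caching' guideline."""
    return system_ram_gb * fraction


def heap_share(heap_gb, system_ram_gb):
    """Fraction of system memory that a given heap size would take."""
    return heap_gb / system_ram_gb


print(suggested_heap_gb(7.5))          # 3.75
print(round(heap_share(4, 7.5), 2))    # 0.53, close to the guideline
print(round(heap_share(6, 7.5), 2))    # 0.8, leaves little for caching
```

A 6 GB heap on a 7.5 GB machine would leave only about 1.5 GB for the operating system and file system cache.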

But if all nodes are in the same cluster then you can round robin and it
will shard the data between them.
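
A small sketch of why round robin is safe: the node that receives a request and the shard that stores the document are independent. The host names are hypothetical, 5 primary shards is assumed, and `zlib.crc32` is only a stand-in for Elasticsearch's internal routing hash.

```python
import zlib
from itertools import cycle

# Hypothetical node addresses behind the load balancer.
NODES = ["es-node-1:9200", "es-node-2:9200", "es-node-3:9200"]


def shard_for(doc_id, primary_shards=5):
    """Routing idea: hash(document id) modulo the number of primary shards."""
    return zlib.crc32(doc_id.encode("utf-8")) % primary_shards


def dispatch(doc_ids, nodes=NODES):
    """Pair each document with the next node in round-robin order."""
    picker = cycle(nodes)
    return [(doc_id, next(picker), shard_for(doc_id)) for doc_id in doc_ids]


for doc_id, node, shard in dispatch(["a1", "a2", "a3", "a4"]):
    print(doc_id, "received by", node, "-> stored in shard", shard)
```

Whichever node receives the request forwards the document to the node holding the target shard, so the load balancer does not need to know where the data lives.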

(In reply to dna lor's message of 17 February 2015 at 17:38.)