Redesigning ES Cluster, questions about optimization

Harry_Truman · February 10, 2014, 11:03pm

I recently finished deploying an ES/Logstash cluster for a small
environment. It's a two-node local cluster but I'm getting horrible
performance and frequent crashes. I'll soon be standing up a cluster in
another environment that's roughly 10x the size of this first one so I've
got to figure out how to better optimize the clusters. Here are the current
resource and indexing stats:
8 CPU, 64GB RAM, 1TB storage
140 log sources, 21 indexed fields, ~35,000 messages per minute, ~60GB/day
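
For a rough sense of scale, the figures above can be converted into per-second and per-message terms. This is only back-of-the-envelope arithmetic. It assumes 1 GB = 10^9 bytes and that the ~60GB/day figure covers the same messages as the ~35,000 per minute figure; neither is stated in the post.

```python
# Figures quoted in the post.
MESSAGES_PER_MINUTE = 35_000
GB_PER_DAY = 60

# Assumption: decimal gigabytes. The post does not say which unit is meant.
BYTES_PER_GB = 10**9

messages_per_second = MESSAGES_PER_MINUTE / 60
messages_per_day = MESSAGES_PER_MINUTE * 60 * 24
avg_bytes_per_message = GB_PER_DAY * BYTES_PER_GB / messages_per_day

print(f"{messages_per_second:.0f} messages/second")           # 583
print(f"{messages_per_day:,} messages/day")                   # 50,400,000
print(f"{avg_bytes_per_message:.0f} bytes/message (average)")  # 1190
```

So the load is on the order of 600 messages per second at a little over 1 KB each, if the assumptions hold.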

At present, index/search performance is awful. If I search for any time
period that's larger than a day or so, ES will usually crash and require a
manual restart. I'm going to be standing up a second dedicated ES system
and I'll be configuring one host for indexing and the second for searching.
I'll also be enabling the mmapfs store, disabling the _all field, and
disabling storing and/or indexing on some of the fields. At least that's my
plan so far -- it makes sense in my head but I'm not sure if it will
actually be the most efficient solution.
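
As a minimal sketch of what that plan could look like, the snippet below builds an index template body that selects the mmapfs store, disables the _all field, and turns off indexing for selected fields. The names `build_template`, `UNSEARCHED_FIELDS`, and `raw_payload` are hypothetical, and the `logstash-*` pattern assumes Logstash's default index naming, which the post does not confirm. The mapping syntax is the 0.90-era form (`"index": "no"` on a string field).

```python
import json

# Hypothetical field names, for illustration only. Replace with fields
# that are never searched on in your own data.
UNSEARCHED_FIELDS = ["raw_payload"]


def build_template(pattern="logstash-*"):
    """Build an index template body reflecting the plan in the post."""
    properties = {
        # "index": "no" keeps the field out of the inverted index.
        name: {"type": "string", "index": "no"}
        for name in UNSEARCHED_FIELDS
    }
    return {
        "template": pattern,  # assumed index name pattern
        "settings": {"index.store.type": "mmapfs"},
        "mappings": {
            "_default_": {
                "_all": {"enabled": False},
                "properties": properties,
            }
        },
    }


print(json.dumps(build_template(), indent=2))
```

The printed JSON would be sent as the body of a PUT to the index template endpoint (/_template/<name>). Fields with "index": "no" are still returned as part of _source, but they can no longer be searched.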

If I'm doing something stupid or if anybody has other recommendations, do
tell!

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/e12e8578-d32a-4e1b-b74d-b1e0073dfa14%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

warkolm · February 10, 2014, 11:08pm

What's the total amount of indexed data (size in GB and document count)? What
about your heap size, shard count, replica count, and ES and Java versions?

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com

Harry_Truman · February 11, 2014, 6:00pm

As of the time of this posting:

elasticsearch-0.90.9-1
jdk-1.7.0_51

ES_HEAP_SIZE=12g
ES_DIRECT_SIZE=20g
index.number_of_replicas: 1

Shards:
"number_of_nodes" : 2,
"number_of_data_nodes" : 2,
"active_primary_shards" : 30,
"active_shards" : 60,

And rather than a block of text, here are the current indexes:

https://lh6.googleusercontent.com/-oHwKVsVYhJ4/UvpaSWCWPUI/AAAAAAAABjg/Dn3Mz3VNAMk/s1600/es.png

jprante · February 11, 2014, 6:17pm

You write "ES will usually crash" - but how does it crash? Are there
messages in the log?
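
One way to answer that is to scan the ES log for error lines around the time of the crash. The helper below is a small sketch: `find_suspect_lines` and `scan_file` are hypothetical names, and the marker strings are only a guess at what might show up (a Java heap exhaustion is logged as `java.lang.OutOfMemoryError`). The log location depends on how ES was installed.

```python
def find_suspect_lines(lines, markers=("OutOfMemoryError", "Exception")):
    """Return (line_number, text) pairs for lines containing any marker."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(marker in line for marker in markers):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_file(path):
    """Scan one log file; `path` is wherever your install writes its log."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_suspect_lines(handle)
```

Calling `scan_file` with the path to the node's log file and posting the matching lines would make the crash much easier to diagnose.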

Do not use Java 7u51; it may cause trouble. 7u25 is known to be stable.

Why do you only use 12G heap if you have 64G RAM on a node? Why do you
limit your resources with ES_DIRECT_SIZE? Why do you use 5 shards per index
instead of 1 if you have 2 nodes?
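
To make the shard question concrete, here is a small worked calculation. It is a sketch: `shard_layout` is a hypothetical helper, and the count of 6 indices is inferred from the reported 30 primary shards at 5 per index rather than stated anywhere in the thread.

```python
def shard_layout(indices, primaries_per_index, replicas, data_nodes):
    """Count primary shards, total shard copies, and copies per data node."""
    primaries = indices * primaries_per_index
    total = primaries * (1 + replicas)
    return {
        "primaries": primaries,
        "total": total,
        "per_node": total / data_nodes,
    }


# Assumption: 30 primaries / 5 per index = 6 indices.
current = shard_layout(indices=6, primaries_per_index=5, replicas=1, data_nodes=2)
single = shard_layout(indices=6, primaries_per_index=1, replicas=1, data_nodes=2)

print(current)  # {'primaries': 30, 'total': 60, 'per_node': 30.0}
print(single)   # {'primaries': 6, 'total': 12, 'per_node': 6.0}
```

The first result matches the reported cluster health (30 active primary shards, 60 active shards). With 1 shard per index, each node would hold 6 shard copies instead of 30 for the same data.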

With 0.90.11, Java 7u25, mmapfs, mlockall, 16G or 20G heap, 1 shard / 1
replica per index for 2 nodes, and unlimited ES_DIRECT_SIZE your system
will work better.
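
The configuration side of that recommendation could be written down roughly as follows. This is a sketch, not a tested configuration: `render_yaml` is a hypothetical helper, the heap value picks 16g from the suggested "16G or 20G", and "unlimited ES_DIRECT_SIZE" is read here as simply not setting that variable.

```python
# Settings for elasticsearch.yml, taken from the recommendation above.
ES_YML = {
    "bootstrap.mlockall": True,
    "index.store.type": "mmapfs",
    "index.number_of_shards": 1,
    "index.number_of_replicas": 1,
}

# Environment for the ES process. ES_DIRECT_SIZE is deliberately left unset.
ENVIRONMENT = {"ES_HEAP_SIZE": "16g"}


def render_yaml(settings):
    """Render a flat settings dict as 'key: value' lines."""
    lines = []
    for key, value in settings.items():
        # bool must be checked first because it is a subclass of int.
        text = str(value).lower() if isinstance(value, bool) else str(value)
        lines.append(f"{key}: {text}")
    return "\n".join(lines)


print(render_yaml(ES_YML))
for name, value in ENVIRONMENT.items():
    print(f"export {name}={value}")
```

Note that mlockall only takes effect if the user running ES is allowed to lock memory (the memlock ulimit), and that the shard and replica counts apply to newly created indices only.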

Jörg
