Memory consumption and shard allocation

Hi folks,

I have a few questions about memory consumption during bulk (re-)indexing
and shard allocation.

We have a small cluster on AWS: 3 nodes, 5 indices (5 shards and 1 replica
each = 50 active shards), approx. 50 GB of data overall.
Our setup:

Shard allocation:
During normal operation we see that almost all the primary shards sit on
node 1 and node 2. Node 3 has only 2 primary shards and 14 replicas. We run
many facet queries. Is it possible that all the queries are fired only at
nodes 1/2 and only at primary shards? We see roughly 90% load on those
nodes: CPU load on nodes 1/2 is constantly above 50-60%, while node 3 stays
below 10%. What could be wrong here?
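
For reference, this is roughly how we check the primary/replica layout per
node (a rough sketch; the host is a placeholder for our real endpoint):

import json
import urllib.request

ES = "http://localhost:9200"  # placeholder host

# The cluster state routing table lists every shard copy, the node it
# lives on and whether it is a primary, which shows how primaries and
# replicas are spread across the three nodes.
with urllib.request.urlopen(ES + "/_cluster/state") as resp:
    state = json.load(resp)

node_names = {nid: info["name"] for nid, info in state["nodes"].items()}

for index, table in state["routing_table"]["indices"].items():
    for shard_id, copies in table["shards"].items():
        for copy in copies:
            role = "primary" if copy["primary"] else "replica"
            print(index, "shard", shard_id, role, "on",
                  node_names.get(copy["node"], copy["node"]))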

Memory consumption:
During bulk reindexing with scan/scroll we ran into minor "cluster
overload" problems; here is an excerpt from node 1:

Catalina.log:

[2013-06-21 17:54:23,751][INFO ][monitor.jvm] [search.cloud.aws]
[gc][ConcurrentMarkSweep][796536][41146] duration [5.7s], collections
[1]/[6s], total [5.7s]/[2h], memory [3.4gb]->[3.2gb]/[3.9gb], all_pools
{[Code Cache] [12.1mb]->[12.1mb]/[48mb]}{[Par Eden Space]
[143.1mb]->[16.5mb]/[532.5mb]}{[Par Survivor Space]
[0b]->[0b]/[66.5mb]}{[CMS Old Gen] [3.2gb]->[3.2gb]/[3.3gb]}{[CMS Perm Gen]
[37.3mb]->[37.3mb]/[82mb]}

Nagios logs:

  • CMS Old Gen 99%(3.3GB),threadpool cache 100%(q2/c4/m4)###WARN### mem
    81%
  • threadpool cache 100%(q4/c4/m4)
  • CMS Old Gen 99%(3.3GB)###WARN### mem 80%,jvm_HeapMemoryUsage
    93%(c3.9GB/u3.7GB/m3.9GB),threadpool search 92%(q0/c451/m486)

What do you think about the logs above, especially catalina.log?
We want to switch each node from c1.xlarge (8 cores, 7 GB RAM) to m1.xlarge
(4 cores, 15 GB RAM, 8 GB Xmx for ES) and to increase the number of nodes.
Does that make sense (keeping in mind the high CPU load on nodes 1/2 with
8 cores)?

Best regards
Vadim


Regarding shard allocation, it could happen that some of your shards are
hotter than others. The hashing algorithm knows nothing about data locality
or your queries. You can use the Index Status API for a few metrics about
the shards, but there is not much. Try using the Allocate API to move
shards from the busy nodes to the less busy ones (and vice versa, since the
shard allocator will attempt to rebalance). Is there a difference? Try
changing it back.
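
If it helps, here is a rough sketch of such a move via the cluster reroute
API (the host, index name, shard number and node names are placeholders for
your own values):

import json
import urllib.request

ES = "http://localhost:9200"  # placeholder host

# Ask the cluster reroute API to move one shard copy from a busy node
# to a quieter one; index, shard and node values are placeholders.
command = {
    "commands": [{
        "move": {
            "index": "my_index",    # hypothetical index name
            "shard": 0,
            "from_node": "node-1",  # busy node
            "to_node": "node-3",    # idle node
        }
    }]
}

req = urllib.request.Request(
    ES + "/_cluster/reroute",
    data=json.dumps(command).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    print(resp.status)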

As far as memory consumption goes, the logs indicate you need more memory.
:) How big is your bulk load? Standard merge settings?

Cheers,

Ivan

Hi Ivan,

thanks for your response.
Yes, we suspected a "hot shard" problem, and that seems to be the case. One
index, with its primary shards on nodes 1/2 and only replicas on node 3,
sees roughly 300 transactions (updates and deletes) per shard at all times;
the other indices see at most 50 per shard. Moving replicas and primaries
between nodes manually solved the high-CPU-load issue.

Memory consumption: We will move to machines with fewer cores but more RAM
(from 8 to 15 GB). Our bulk loads are done with the ElasticsearchExporter
(100-doc bulks, 5 indices = 17 million docs), and we use the default merge
settings.
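
For context, our reindex boils down to scan/scroll reads feeding bulk
writes, roughly like this sketch (assuming the elasticsearch-py client;
host, index names and the batch size are placeholders, the exporter
currently sends 100-doc bulks):

from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk, scan

es = Elasticsearch("http://localhost:9200")  # placeholder host

SOURCE = "my_index_v1"  # hypothetical source index
TARGET = "my_index_v2"  # hypothetical target index
BATCH = 500             # bulk size to experiment with; we send 100 today

def actions():
    # Scan/scroll through the source index and re-emit every hit as a
    # bulk "index" action against the target index.
    for hit in scan(es, index=SOURCE, query={"query": {"match_all": {}}}):
        yield {
            "_index": TARGET,
            "_id": hit["_id"],
            "_source": hit["_source"],
        }

ok, errors = bulk(es, actions(), chunk_size=BATCH)
print("indexed:", ok, "errors:", errors)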

Cheers,
Vadim
