Load balancing and a node with no primary shard

Hi,

We have a system based on Elasticsearch and we're performing volume tests
in order to come up with hardware requirements.

Following the tests, I am confused by the role of the primary shard in a
load-balanced environment and am wondering about performance after a
node failure. I also have issues with ES_MAX_MEM.

Can you please help?

In my lab I have two data nodes and one client node (non-data), configured
for two shards and one replica. My client is sending Index requests to the
client node. ES_MAX_MEM=ES_MIN_MEM=8g and each node has 16GB RAM. In the
volume tests we sent a steady stream of messages to ES (12,500 messages
every 5 seconds).

During testing I shut down one of the data nodes, and as a result the other
node showed both of its shards as primary shards.
When the first node came back up the cluster turned green, but the second node
still held both primary shards.
The other thing that happened is that Elasticsearch's memory consumption
grew constantly until it consumed the entire 16GB of RAM.

  1. Shouldn't adding a new node result in splitting primary shards
    between all nodes? How come one of the shards in the first node did not
    become a primary shard?
  2. Since the client node is load balancing, what happens when it sends
    data to the node which has no primary shard? Does that impact performance?
  3. How come ES consumed 16GB RAM when ES_MAX_MEM is set to 8g?

Thanks,
Guy

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/1de23735-5b44-49b3-8e2a-96ecdc362f14%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

  1. What you observe is correct. Replica shards can be promoted to primary
    shards instantly, and a primary keeps its role for as long as its node is
    running; it is not handed back when the failed node rejoins. The reason is
    that there is no functional difference between a primary and a replica.
    You should not worry about primary shards at all; they do not matter for
    load balancing.

  2. Each node can route any index request to anywhere in the cluster,
    whether or not it holds primary shards. Every node that holds a copy of
    the target shard receives the index request: the primary first, then the
    replicas. Only the order differs, not the resulting indexing load.

  3. ES_MAX_MEM does not limit the total memory of the process. It sets the
    maximum heap size of the JVM. Besides the heap, the JVM allocates other
    memory areas, such as direct buffer memory and caches.

Jörg
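
To make point 2 concrete, here is a minimal sketch that counts shard-level
index operations per node for the lab setup described above (two shards, one
replica, two data nodes), once with the primaries spread across both nodes and
once with both primaries on the second node. The node names, the document IDs,
and the crc32-based routing are assumptions made for illustration; crc32 is
only a stand-in and not the hash Elasticsearch uses. The sketch also ignores
the small coordination cost of forwarding a request from the primary to its
replica.

```python
import zlib
from collections import Counter

NUM_SHARDS = 2  # two primary shards, as in the lab setup


def shard_for(doc_id: str) -> int:
    # Stand-in routing function (assumption): crc32 is NOT Elasticsearch's hash.
    return zlib.crc32(doc_id.encode("utf-8")) % NUM_SHARDS


def index_ops_per_node(layout, doc_ids):
    """Count how many times each node indexes a document.

    layout maps a node name to a list of (shard_number, role) copies it holds.
    """
    ops = Counter()
    for doc_id in doc_ids:
        shard = shard_for(doc_id)
        for node, copies in layout.items():
            # Every copy of the target shard indexes the document,
            # regardless of whether it is the primary or the replica.
            ops[node] += sum(1 for s, _role in copies if s == shard)
    return dict(ops)


# Primaries spread across both nodes (hypothetical node names).
balanced = {
    "node1": [(0, "primary"), (1, "replica")],
    "node2": [(0, "replica"), (1, "primary")],
}

# After node1 was restarted: node2 keeps both primaries.
after_restart = {
    "node1": [(0, "replica"), (1, "replica")],
    "node2": [(0, "primary"), (1, "primary")],
}

# One 5-second batch from the volume test: 12,500 messages.
docs = [f"msg-{i}" for i in range(12_500)]

print(index_ops_per_node(balanced, docs))       # {'node1': 12500, 'node2': 12500}
print(index_ops_per_node(after_restart, docs))  # {'node1': 12500, 'node2': 12500}
```

With two data nodes and one replica, each node holds exactly one copy of every
shard, so each node indexes every document once in both layouts. That is
12,500 documents per 5 seconds, or 2,500 per second per node, wherever the
primaries happen to be.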
