On Wednesday, 13 August 2014 12:10:14 UTC+1, Alex wrote:
Hello, I would like some clarification about node types and their usage.
We will have 3 client nodes and 6 data nodes. The 6 1TB data nodes can also
be masters (discovery.zen.minimum_master_nodes set to 4). We will use
Logstash and Kibana. Kibana will be used 24/7 by anywhere from a couple to a
handful of people.
Some questions:
1 - Should incoming Logstash write requests be sent to the cluster in
general (using the cluster setting in the elasticsearch output), or
specifically to the client nodes or to the data nodes (via a load balancer)?
I am unsure what kind of node is best for handling writes.
2 - If client nodes exist in the cluster, are Kibana requests
automatically routed to them? Do I need to tell Kibana which
nodes to contact?
3 - I have heard conflicting information about master nodes and the
minimum_master_nodes setting. I've heard that you should have an odd number
of master nodes, but I fail to see why the parity of the number of masters
matters as long as minimum_master_nodes is set to at least N/2 + 1. Does it
really need to be odd?
4 - I have been advised that the client nodes will use a huge amount of
memory (which makes sense given the nature of the Kibana facet queries).
64GB per client node was recommended, but I have no idea whether that sounds
right. I can't actually test it right now, so any
further guidance on that would be helpful.
I'd be so grateful to hear from you, even if you only know something about
one of my queries.
Thank you for your time,
Alex
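For what it's worth, the arithmetic behind question 3 can be checked with a small sketch. This is only an illustration in Python, not something from the original post; the function names `quorum` and `tolerated_failures` are made up here. It assumes the usual majority rule, floor(N/2) + 1, for minimum_master_nodes.

```python
def quorum(master_eligible: int) -> int:
    """Smallest majority of master-eligible nodes: floor(N / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible: int) -> int:
    """How many master-eligible nodes can be lost while a quorum remains."""
    return master_eligible - quorum(master_eligible)


if __name__ == "__main__":
    # Compare odd and even counts of master-eligible nodes side by side.
    for n in range(3, 8):
        print(f"N={n}: minimum_master_nodes={quorum(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

Under that rule, 6 master-eligible nodes need a quorum of 4 and survive the loss of 2, which is the same tolerance as 5 nodes with a quorum of 3. So an even count is not wrong, it just buys no extra fault tolerance over the odd count below it.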
On Saturday, 16 August 2014 01:04:37 UTC+1, Mark Walkom wrote:
1 - Up to you. We use the http output and a round-robin DNS A record
pointing at our 3 masters.
2 - They are routed, but it makes more sense to specify the nodes explicitly.
3 - You're right, but most people only use 1 or 2 masters, which is why they
get told to have at least 3.
4 - That sounds like a lot. We use masters that double as clients and they
only have 8GB. Our usage sounds similar and we don't have issues.
I wouldn't bother with 3 client-only nodes to start. Use them as master and
client, and if you find you are hitting memory issues due to queries
you can re-evaluate things.
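To picture what point 1 does to the write traffic, here is a rough Python sketch of requests being spread over three hosts in turn. The hostnames and the `assign_hosts` helper are hypothetical, and real DNS round robin depends on resolver and client caching, so treat the strict rotation here as an idealisation and not a description of the actual setup.

```python
from itertools import cycle, islice

# Hypothetical addresses standing in for the three masters behind the A record.
HOSTS = ("es-master-1:9200", "es-master-2:9200", "es-master-3:9200")


def assign_hosts(n_requests, hosts=HOSTS):
    """Return the host each of n_requests would hit under strict rotation."""
    return list(islice(cycle(hosts), n_requests))


if __name__ == "__main__":
    # Show where the first few requests would land.
    for i, host in enumerate(assign_hosts(6), start=1):
        print(f"request {i} -> {host}")
```

The point is only that no single node takes all incoming requests; which node type should sit behind the record is the separate question discussed in this thread.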
I've done more investigating and it seems that a client (AKA query) node
cannot also be a master node. As it says here:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-discovery-zen.html#master-election
Nodes can be excluded from becoming a master by setting node.master to
false. Note, once a node is a client node (node.client set to true), it
will not be allowed to become a master (node.master is automatically set to
false).
And the elasticsearch.yml config file says:
# 2. You want this node to only serve as a master: to not store any data and
#    to have free resources. This will be the "coordinator" of your cluster.
#
#node.master: true
#node.data: false
#
# 3. You want this node to be neither master nor data node, but
#    to act as a "search load balancer" (fetching data from nodes,
#    aggregating results, etc.)
#
#node.master: false
#node.data: false
So I'm wondering how exactly you set up your client nodes to also be master
nodes. It seems like a master node can only be either purely a master or
master + data.
Perhaps you could show the relevant parts of the config from one of your
client nodes?
Many thanks, Alex
In the Java API, when you start a node, you can either call client(true) or
set "node.client: true".
Once a node is a client, it can no longer store data or become a master.
The node role is also recognized during discovery, so the cluster can decide
how to add the node:
node.master = true -> this node is eligible to become a master
node.data = true -> this node is added to shard routing
node.client = true -> this node is excluded from master election and shard
routing
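The three rules above can be written down as a small Python sketch. This only models the behaviour as described in this thread; it is not Elasticsearch code, and `resolve_role` and its labels are invented for illustration. It assumes node.master and node.data count as true when unset, and what a real node does when conflicting values are set explicitly may differ.

```python
def resolve_role(settings):
    """Classify a node from its node.* settings, following the rules above."""
    # Assumption: node.master and node.data are treated as true when unset.
    if settings.get("node.client", False):
        # Per the thread: a client neither stores data nor becomes master.
        return "client"
    master = settings.get("node.master", True)
    data = settings.get("node.data", True)
    if master and data:
        return "master-eligible + data"
    if master:
        return "dedicated master"
    if data:
        return "data only"
    # Neither master nor data: the "search load balancer" case
    # from the elasticsearch.yml comments quoted earlier.
    return "client"


if __name__ == "__main__":
    examples = [
        {},
        {"node.master": True, "node.data": False},
        {"node.master": False, "node.data": False},
        {"node.client": True},
    ]
    for s in examples:
        print(s, "->", resolve_role(s))
```

Read this way, "master + client" is not a combination the settings can express: a node that is master-eligible and holds no data comes out as a dedicated master, even if it also happens to receive the client requests.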