Ideal setup for an EC2 cluster configuration

I know I have more reading to do, but some of the terms are still confusing
to me.
I would like to index about 5-6 million documents, each about 5 KB in size.
Everything lives in the cloud, both the feeder instances and the index
instances.
I have 3 nodes, and all of them are eligible to be master nodes.

Should I make all three nodes masters, and should they also have
"node.data" = true?
Does making all 3 nodes master-eligible reduce the split-brain problem?

Can someone elaborate on what I need to be thinking about while trying to
come up with the best configuration?

Thanks

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/bfcfc42a-3e19-4bff-88b9-b6beb6044bde%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

I also wanted to add that, while reading up on this topic, the idea of a
dedicated master-only node kept popping up.

With this configuration:
node 1 {node.master = true, node.data = false}
node 2 {node.master = false, node.data = true}
node 3 {node.master = false, node.data = true}

What happens if node 1, the master, encounters a network issue?
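
One way to reason about this layout is with a toy model. The sketch below is
only an illustration, not Elasticsearch code: the `Node` class and the
`can_elect_master` helper are hypothetical names, and the model assumes that
a cluster needs at least one reachable master-eligible node to have a master,
ignoring any quorum settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a node's role flags (not an Elasticsearch API)."""
    name: str
    master_eligible: bool
    data: bool


def can_elect_master(nodes, unreachable):
    """True if at least one master-eligible node is still reachable."""
    return any(n.master_eligible for n in nodes if n.name not in unreachable)


# The layout from this message: one dedicated master, two data-only nodes.
dedicated = [
    Node("node1", master_eligible=True, data=False),
    Node("node2", master_eligible=False, data=True),
    Node("node3", master_eligible=False, data=True),
]

# The alternative where every node is master-eligible and holds data.
all_roles = [Node(f"node{i}", master_eligible=True, data=True) for i in (1, 2, 3)]

print(can_elect_master(dedicated, {"node1"}))  # False: no master candidate is left
print(can_elect_master(all_roles, {"node1"}))  # True: node2 or node3 can take over
```

In this simplified model, the single master-eligible node is a single point
of failure for the first layout.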

If you only have 3 nodes, I would just stick to the defaults, which make
every node both master-eligible and a data node.

Having dedicated master (no data) nodes helps eliminate OOM pressure on the
master, since the actual data lives elsewhere. With so few nodes, every
machine should hold a portion of the data. Dedicated master nodes make sense
when you have a bigger cluster, IMHO.
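
On the split-brain question, the usual reasoning is quorum arithmetic: a
master should only be elected when a majority of the master-eligible nodes
can see each other. A minimal sketch, assuming the Elasticsearch 1.x setting
`discovery.zen.minimum_master_nodes` (not mentioned elsewhere in this thread)
is set to that majority; the `quorum` function name is made up for
illustration:

```python
def quorum(master_eligible_nodes: int) -> int:
    """Smallest majority of the master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


for n in (1, 2, 3):
    print(n, quorum(n))
# 1 1
# 2 2
# 3 2
```

With 3 master-eligible nodes the majority is 2, so a single isolated node
cannot elect itself master, and the remaining two can still form a cluster.
With only 2 master-eligible nodes the majority is also 2, so losing either
one leaves no electable master.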

Besides the number of documents and their size, the most important metric
is how many fields are indexed and of what type. Dozens of analyzed string
fields per document might cause memory problems. With few fields and mainly
numerical data, you should have no problems.
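
For a sense of scale, the raw size of the source documents can be estimated
from the figures given in the question. A back-of-the-envelope sketch,
assuming "5k" means roughly 5 KB (5,000 bytes) per document; the
`raw_size_gb` name is illustrative, and the size of the index on disk and in
memory will differ depending on how the fields are mapped and analyzed:

```python
def raw_size_gb(doc_count: int, doc_bytes: int = 5_000) -> float:
    """Raw source size in decimal gigabytes (1 GB = 10**9 bytes)."""
    return doc_count * doc_bytes / 10**9


print(raw_size_gb(5_000_000))  # 25.0
print(raw_size_gb(6_000_000))  # 30.0
```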

I would start off with the defaults and only make changes if you notice
a problematic pattern.

Cheers,

Ivan

So if you have a 9 node cluster... and you want to have 3 master eligible
nodes... do you need 3 machines with master=yes, data=false?

That would mean 2 machines just sit around doing nothing until something is
wrong with the master. Is that correct? I'm trying to set up a large
cluster with decent redundancy...

Also, what's the benefit of 'query nodes': master=no, data=no?

-Ankit

Yes, but there is nothing stopping you from sending queries to these master
nodes if you wish.

Client nodes can be useful if you run a lot of queries, or heavy ones.
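
The four combinations of the two flags discussed in this thread can be
summarized in a small lookup. This is only a sketch: the `node_role` function
and its label strings are illustrative, using the terms from the messages
above (the "query nodes" from the question are what is called client nodes
here).

```python
def node_role(master: bool, data: bool) -> str:
    """Map the master/data flags to the role names used in this thread."""
    if master and data:
        return "master-eligible data node (the default)"
    if master:
        return "dedicated master node"
    if data:
        return "data-only node"
    return "client node"


for master in (True, False):
    for data in (True, False):
        print(f"master={master}, data={data}: {node_role(master, data)}")
```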

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com
