How to reach a cluster using HTTP no matter which machine is master

One of the top-level things I don't understand about large installations
is how to reach the cluster when a master node fails and a new master is
elected.

Let's say that I/we have this scenario:

5 machines in cluster (as individual nodes)

192.169.10.0 black.mycompany.com
192.169.10.1 brown.mycompany.com
192.169.10.2 red.mycompany.com
192.169.10.3 orange.mycompany.com
192.169.10.4 green.mycompany.com

cluster name ElasticColors
Machines/nodes are also application servers (at a certain point in the
growth of the site).

Question 1/ How does an application running on any of those machines reach
the cluster?
Question 2/ Let's assume I'm using a load balancer in front of those
machines to reach the application servers. An application server happens to
be healthy, but the ES instance on that machine dies. Will the application
still be able to reach the cluster? How would it choose another way to reach
the cluster if it couldn't?

Any other issues or better design ideas?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/fd279bc4-a76e-43cb-8ea5-3362bd3f5151%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

From what I've read, the application needs to round-robin its selection of
these IP addresses and check for valid access at each one until it finds
one that succeeds. The client needs to know all the IP addresses or
FQDNs of the servers in order to reach them if there are a lot of failures
going on. It's possible to add nodes to the cluster (with or without FQDNs?)
without the application knowing anything about them, but it will never be
able to use those nodes to access the cluster.
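
A minimal Python sketch of the round-robin-with-failover idea described above, assuming the five host names from the scenario and the usual HTTP port 9200 (the port, the class name `RoundRobinHosts`, and the `http_probe` helper are illustrative assumptions, not part of any Elasticsearch client):

```python
import urllib.error
import urllib.request

# Host names from the scenario; the port is an assumption (9200 is the usual HTTP port).
HOSTS = [
    "black.mycompany.com",
    "brown.mycompany.com",
    "red.mycompany.com",
    "orange.mycompany.com",
    "green.mycompany.com",
]


def http_probe(host, port=9200, timeout=2.0):
    """Return True if the node answers an HTTP GET on its root endpoint."""
    try:
        with urllib.request.urlopen(f"http://{host}:{port}/", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


class RoundRobinHosts:
    """Hypothetical helper: rotate through a fixed host list, skipping dead nodes."""

    def __init__(self, hosts, probe=http_probe):
        self._hosts = list(hosts)
        self._probe = probe  # injected so the logic can be exercised without a network
        self._next = 0

    def pick(self):
        """Return the next reachable host, starting after the last one used."""
        count = len(self._hosts)
        for offset in range(count):
            index = (self._next + offset) % count
            host = self._hosts[index]
            if self._probe(host):
                self._next = (index + 1) % count
                return host
        raise RuntimeError("no reachable node in the host list")
```

The limitation mentioned above shows up directly: `pick()` can only ever return a host that was in the list it was constructed with.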

After that, one of two scenarios happens:

For READS, the node that was found to be operational does its own round
robin internal to ES, switching between the primary shard and the replica
shards.
For MODIFYING actions, the node that was found to be operational sends the
request to the primary shard ONLY. This primary shard is chosen by the
cluster, and the application does not need to know anything about it. After
the modification, the changes are mirrored to the replica shards.
Then the request is acknowledged by the primary shard as complete, and the
client gets the status/answer. This is the DEFAULT 'sync' behavior. It is
NOT RECOMMENDED, but possible, to change to 'async' behavior, in which the
primary shard replies with just its own success and then sends the mirror
requests to the replicas.

Anyone who knows better, please correct me.

In reply to the original post by Dennis (Friday, February 27, 2015 at 5:38:04 PM UTC-8).

There are several misunderstandings.

For the Java API, there is node discovery and a fault detection mechanism
that sends pings (heartbeats) every five seconds. When a node dies, clients
become aware of it and no longer use the faulty node. New nodes are added
automatically; this is called "sniff mode" (it can be disabled).
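
To make the mechanism above concrete, here is a rough Python sketch of ping-based fault detection combined with sniffing. It is not the Java client's actual implementation: `NodeMonitor`, `ping`, and `discover` are hypothetical names introduced for illustration, and only the five-second interval comes from the description above.

```python
import time

PING_INTERVAL_SECONDS = 5  # the heartbeat interval mentioned above


class NodeMonitor:
    """Hypothetical sketch: track which known nodes currently answer pings."""

    def __init__(self, seed_nodes, ping, discover=None):
        self.known = set(seed_nodes)
        self.alive = set()
        self._ping = ping          # assumed callable: node -> bool
        self._discover = discover  # assumed callable: () -> iterable of node names

    def check_once(self):
        # "Sniffing": ask the cluster which nodes exist and remember new ones.
        if self._discover is not None:
            self.known |= set(self._discover())
        # Fault detection: only nodes that answer the ping count as alive.
        self.alive = {node for node in self.known if self._ping(node)}
        return self.alive

    def run(self, rounds):
        """Repeat the check at the heartbeat interval for a fixed number of rounds."""
        for _ in range(rounds):
            self.check_once()
            time.sleep(PING_INTERVAL_SECONDS)
```

A client built this way would send requests only to members of `alive`, so a dead node drops out after the next check and a newly added node becomes usable without changing the seed list.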

Regarding shard replication, this is driven by a write quorum. The
default is "async", which means the write call returns after a quorum of
successful writes on nodes has been met. Another option is "sync", which
means that all replicas must respond before the write call returns to the
client.

Jörg

In reply to Dennis's message of Sat, Feb 28, 2015 at 4:43 AM.

The default value for replication is sync: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-index_.html#index-replication

Write consistency set to quorum is something else. By default, the node that gets the index request will accept or reject it depending on the number of shard copies (primary + replicas) available for the shard. The default is quorum, with an exception when there is only one replica.
This means that if you set 3 replicas and only 1 replica is allocated, your cluster will reject the index request.

Details here: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-index_.html#index-consistency
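
The arithmetic behind that example can be sketched in Python. This assumes the quorum formula as given in the Elasticsearch 1.x documentation, int((primary + number_of_replicas) / 2) + 1, with the quorum only enforced when there is more than one replica; the function names are made up for illustration.

```python
def required_active_copies(number_of_replicas):
    """Shard copies that must be active before a write is accepted (quorum)."""
    if number_of_replicas <= 1:
        # The exception: with zero or one replica, the primary alone is enough.
        return 1
    total_copies = 1 + number_of_replicas  # one primary plus its replicas
    return total_copies // 2 + 1


def write_accepted(number_of_replicas, active_copies):
    """True if enough shard copies are active for the index request to proceed."""
    return active_copies >= required_active_copies(number_of_replicas)
```

With 3 replicas there are 4 copies in total and the quorum is 3, so a primary plus a single allocated replica (2 active copies) is not enough and the request is rejected.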

David

In reply to Jörg's message of 28 Feb 2015 at 09:19.

So David, can you comment on Jorg's assertion that the Java API takes care
of node failures and node additions?

I plan on using PHP and http(s) on the local network. Tell me if I am
correct, but I will probably have to:
A/ use a unicast list
B/ write some kind of daemon or cron-based script that will query
the list and, if there's a failure, kill the old node and restart a new
one at the same network host name and IP

In reply to David Pilato's message of Saturday, February 28, 2015 at 12:41:13 AM UTC-8.

Jorg is correct.

About PHP, have a look at the PHP client.
This page could help: http://www.elasticsearch.org/guide/en/elasticsearch/client/php-api/current/_the_connection_pool.html

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet https://twitter.com/dadoonet | @elasticsearchfr https://twitter.com/elasticsearchfr | @scrutmydocs https://twitter.com/scrutmydocs

In reply to Dennis's message of 28 Feb 2015 at 18:17.

The PHP client (and the other non-Java clients) are new to me! Is there any
speed difference vs. using HTTP to localhost? I would assume that is one
major purpose of it.

In reply to David Pilato's message of Saturday, February 28, 2015 at 9:29:48 AM UTC-8.

Some time ago, the Elasticsearch team released a series of official
Elasticsearch clients with comparable feature sets.

I'm not sure what speed difference you are after. All HTTP traffic comes
with some percentage of overhead, but it is negligible in the vast
majority of ES scenarios. I recommend setting up a test bed for yourself and
taking some measurements, so you can decide which setup meets your
requirements.
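
For anyone curious what this kind of failover looks like over plain HTTP, below is a minimal Python sketch. It is not how any official client is implemented. It assumes the five host names from the original question and the usual HTTP port 9200; `RoundRobinPool` and `http_get` are hypothetical names invented for this example.

```python
import urllib.request

# Host names come from the example cluster in this thread.
# Port 9200 is an assumption (the usual Elasticsearch HTTP port).
HOSTS = [
    "http://black.mycompany.com:9200",
    "http://brown.mycompany.com:9200",
    "http://red.mycompany.com:9200",
    "http://orange.mycompany.com:9200",
    "http://green.mycompany.com:9200",
]


def http_get(url, timeout=2.0):
    """Plain HTTP GET using only the standard library."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


class RoundRobinPool:
    """Try the known nodes in turn and skip the ones that do not answer."""

    def __init__(self, hosts, fetch=http_get):
        self.hosts = list(hosts)
        self.fetch = fetch  # injectable, so the pool can be tested offline
        self._next = 0

    def get(self, path):
        errors = []
        for i in range(len(self.hosts)):
            idx = (self._next + i) % len(self.hosts)
            host = self.hosts[idx]
            try:
                body = self.fetch(host + path)
            except OSError as exc:
                # URLError and socket timeouts are OSError subclasses.
                errors.append((host, exc))
                continue
            # Remember where to start next time so load is spread out.
            self._next = (idx + 1) % len(self.hosts)
            return host, body
        raise ConnectionError(f"no node reachable: {errors}")
```

Any node that answers can serve the request, whichever node is currently master. The sketch only knows the hosts it was given, so nodes added later would not be used until the list is updated.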

Jörg


Thanks, Jorg. I am just going to go with the client.
