Elasticsearch cluster spreading the bulk tasks


(Pavel P) #1

Hi,

Could someone clarify the following for me?

When I have an ES cluster consisting of 2 machines, how should I send
bulk index requests to it?

  1. Do I understand correctly that I can send everything to any node I have,
    and it will then be spread across the cluster for indexing automatically?
  2. Do I need to put a load balancer in front of the cluster so that each node
    receives some portion of the indexing pressure?

How is it supposed to work by design?

Currently I use a load balancer in front of my two instances, and as I see in
Bigdesk, the bulk queue is growing on the master node, while the slave
node stays quite relaxed.

Master node:

https://lh6.googleusercontent.com/-muM2VK-lKwM/U-I2iLbXh3I/AAAAAAAAAH0/RkYg6DVf7fI/s1600/master_node_queue.png

Slave node:

https://lh5.googleusercontent.com/-Ed4nY_7nrw4/U-I2Yzdb76I/AAAAAAAAAHs/ozL7MnZR1lA/s1600/Screen+Shot+2014-08-06+at+5.05.55+PM.png

Is it OK that my ES cluster of 2 machines, which are c3.large (4 CPU,
8 GB memory), is only able to index 13k small documents per 10 seconds (I use
it as the output for Logstash)?
What performance should I expect?

Regards,

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/657028e7-ae1e-409f-85b6-faa97f58c500%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Pavel P) #2

Still interested to know your view on the issue.


(Jörg Prante) #3
  1. Yes, it is spread automatically

  2. No
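As a rough illustration of why any node can receive the bulk request: the receiving node works out a target shard for each document and forwards it to the node holding that shard. The sketch below assumes the documented idea of hashing a routing value (by default the document id) modulo the number of primary shards. `zlib.crc32` is only a stand-in and is not the hash Elasticsearch actually uses, and the function names are made up for this example.

```python
import zlib
from collections import defaultdict


def shard_for(routing: str, num_primary_shards: int) -> int:
    """Pick a shard by hashing the routing value.

    crc32 is a stand-in hash for illustration, not Elasticsearch's own.
    """
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


def split_bulk_by_shard(doc_ids, num_primary_shards):
    """Group document ids by target shard, the way a receiving node
    conceptually splits one bulk request into per-shard pieces."""
    groups = defaultdict(list)
    for doc_id in doc_ids:
        groups[shard_for(doc_id, num_primary_shards)].append(doc_id)
    return dict(groups)


if __name__ == "__main__":
    ids = [f"doc-{i}" for i in range(10)]
    for shard, members in sorted(split_bulk_by_shard(ids, 5).items()):
        print(shard, members)
```

Which node ends up doing the work therefore depends on where the shards live, not on which node the client talked to.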

Bulk requests queue up where the shards are. So check your shard
distribution: for each index, the shards should be spread equally across
the nodes. Otherwise your system load is unbalanced.
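One way to check the distribution is to count started primaries and replicas per node from the `_cat/shards` output. This is a minimal sketch, assuming a node reachable at `http://localhost:9200` (a placeholder address) and the column selection `h=index,shard,prirep,state,node`; the helper names are made up for this example.

```python
from collections import Counter
from urllib.request import urlopen


def count_shards_per_node(cat_shards_text):
    """Count started shards per (node, 'p' or 'r') pair.

    Expects one shard per line with the columns:
    index shard prirep state node
    """
    counts = Counter()
    for line in cat_shards_text.splitlines():
        fields = line.split()
        # Skip blank lines and shards that are not started (no node assigned).
        if len(fields) < 5 or fields[3] != "STARTED":
            continue
        prirep = fields[2]
        node = " ".join(fields[4:])  # node names may contain spaces
        counts[(node, prirep)] += 1
    return counts


def fetch_cat_shards(base_url="http://localhost:9200"):
    """Fetch the shard listing from a node (base_url is an assumed address)."""
    url = base_url + "/_cat/shards?h=index,shard,prirep,state,node"
    with urlopen(url) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    for (node, prirep), n in sorted(count_shards_per_node(fetch_cat_shards()).items()):
        print(node, prirep, n)
```

If one node holds all the `p` entries for an index and the other only `r` entries, that matches the kind of imbalance described above.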

Jörg


(Pavel P) #4

@Jörg

If I have the following situation:
[image: Inline image 1]

Does this mean that all the primary shards are allocated on one node,
and so this one node is doing all the indexing?

If that is true, how can I configure the cluster so that the primary shards
are allocated equally across it?

If I should configure these settings:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-update-settings.html

then which values should I set?

Assuming that I currently have the default values set, why are the primary
shards not distributed across the cluster?

Regards,


(Jörg Prante) #5

You do not need to configure primary shards; the automatic allocation of
shards is OK.

When you index, the indexing load is distributed over primary and
replica.

Because you only have two nodes and one replica, Elasticsearch skips the
replica by default, unless you change the write consistency from
"quorum" to "all".

If you use three nodes, this will work better, because the quorum formula is
(n / 2 + 1) for n > 2 and ES writes to at least two nodes.
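To make the formula concrete, here is a small sketch. It assumes n counts the copies of a shard (one primary plus its replicas), that the division is integer division, and that at or below n = 2 a single active copy is enough for "quorum". The function name is made up for this example and is not an Elasticsearch API.

```python
def required_active_copies(total_copies: int, consistency: str = "quorum") -> int:
    """Number of active shard copies required before a write proceeds.

    total_copies is assumed to be 1 primary + number of replicas.
    """
    if consistency == "all":
        return total_copies
    if consistency == "quorum" and total_copies > 2:
        return total_copies // 2 + 1
    # "one", or "quorum" with two or fewer copies
    return 1


if __name__ == "__main__":
    print(required_active_copies(2))         # primary + 1 replica
    print(required_active_copies(3))         # primary + 2 replicas
    print(required_active_copies(2, "all"))  # both copies required
```

With one primary and one replica the quorum works out to 1, which is the "tricky number 2" case; with three copies it becomes 2.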

See also https://github.com/elasticsearch/elasticsearch/issues/6482

Quoting kimchy:

"I will add, that the number 2 is just a tricky number when it comes to
distributed systems. I would argue that either quorum in this case set to
2, or it being set to 1 can be debatable..., since 2 in this case also
means all. The reason we went with the default mentioned is because many
times people run Elasticsearch using 1 node, or 2 (as a search platform for
their database), on top of the just getting started aspect, and they are ok
with potentially needing to reindex the data with the downsides that come
with 1 or 2 nodes."

Jörg


(Pavel P) #6

Thanks Jörg,

As soon as I added a 3rd node, the distribution of the primary shards
changed.

Regards,


(Pavel P) #7

Hi Jörg,

After your help and the cluster update, I can now index everything
well, but a different problem has come up.

I'm not able to search :)

Could you please take a look at this issue:
https://groups.google.com/forum/#!topic/elasticsearch/jbHg99gqRf0

and share your expertise?

Regards,

On Thu, Aug 7, 2014 at 1:36 PM, Pavel P pavel@kredito.de wrote:

Thanks Jörg,

As soon as I add 3rd node, the distribution of the primary shards was
changed.

Regards,

On Thu, Aug 7, 2014 at 12:59 PM, joergprante@gmail.com <
joergprante@gmail.com> wrote:

You need not to configure primary shards, the automatic allocation of
shards is ok.

If you index, the indexing load will be distributed over primary and
replica.

Because you only have two nodes and one replica, Elasticsearch skips the
replica by default, unless you configure the write consistency from
"quorum" to "all".

If you use three nodes, this will work better, because the quorum fomula
is (n / 2 + 1) for n > 2 and ES writes to at least two nodes.

See also https://github.com/elasticsearch/elasticsearch/issues/6482

Quoting kimchy:

"I will add, that the number 2 is just a tricky number when it comes to
distributed systems. I would argue that either quorum in this case set to
2, or it being set to 1 can be debatable..., since 2 in this case also
means all. The reason we went with the default mentioned is because many
times people run Elasticsearch using 1 node, or 2 (as a search platform for
their database), on top of the just getting started aspect, and they are ok
with potentially needing to reindex the data with the downsides that come
with 1 or 2 nodes."

Jörg

On Thu, Aug 7, 2014 at 11:33 AM, Pavel P pavel@kredito.de wrote:

@Jörg

If I have the next situation:
[image: Inline image 1]

Does this mean the all the primary shards are allocated on the one node,
then this one node is indexing all the queries?

If it's true - how could I configure that the primary shards would be
allocated through the cluster equally?

If I should configure those values:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-update-settings.html

Then which configuration values should I set?

Assuming that currently I have the default values set - why the primary
shards are not distributed among the cluster?

Regards,

On Thu, Aug 7, 2014 at 12:56 AM, joergprante@gmail.com <
joergprante@gmail.com> wrote:

  1. Yes, it is spread automatically

  2. No

The bulk queue up is where the shards are. So check your shard
distribution. They should be equal on each node for an index. Otherwise
your system load is unbalanced.

Jörg

On Wed, Aug 6, 2014 at 10:36 PM, Pavel P pavel@kredito.de wrote:

Still interested to know your view on the issue.

On Wednesday, August 6, 2014 5:12:41 PM UTC+3, Pavel P wrote:

Hi,

Could someone clarify me the next:

When I have the ES cluster, consisting from 2 machines, how should I
send the bulk index requests to them.

  1. Do I understand right that I can send everything to any node I
    have, then it would be spreaded for indexing among the cluster
    automatically?
  2. Do I need to cover the cluster with the load balancer so each node
    would receive some portion of the indexing pressure?

How it supposed to work by design?

Currently I use the load balancer over my two instances, and as I see
with the Bigdesk - the bulk queue is growing on the master node, while the
slave node feels itself quite relaxed.

Master node:

https://lh6.googleusercontent.com/-muM2VK-lKwM/U-I2iLbXh3I/AAAAAAAAAH0/RkYg6DVf7fI/s1600/master_node_queue.png

Slave node:

https://lh5.googleusercontent.com/-Ed4nY_7nrw4/U-I2Yzdb76I/AAAAAAAAAHs/ozL7MnZR1lA/s1600/Screen+Shot+2014-08-06+at+5.05.55+PM.png

Is that ok, that my ES cluster from 2 machines, which are c3.large (4
CPU, 8Gb memory) is only able to index 13k small documents per 10 seconds
(I use it as output for the logstash)?
Which performance should I expect?

Regards,

--
You received this message because you are subscribed to the Google
Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/2fa3fe97-ddfd-403c-98f3-22dc0bd1c70b%40googlegroups.com
https://groups.google.com/d/msgid/elasticsearch/2fa3fe97-ddfd-403c-98f3-22dc0bd1c70b%40googlegroups.com?utm_medium=email&utm_source=footer
.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to a topic in the
Google Groups "elasticsearch" group.
To unsubscribe from this topic, visit
https://groups.google.com/d/topic/elasticsearch/7XHQjAoKPfw/unsubscribe
.
To unsubscribe from this group and all its topics, send an email to
elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoHOe-0ppbhZ_%2ByROzww7YprMBh8RjrM8WYRiYhExopBAQ%40mail.gmail.com
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoHOe-0ppbhZ_%2ByROzww7YprMBh8RjrM8WYRiYhExopBAQ%40mail.gmail.com?utm_medium=email&utm_source=footer
.

For more options, visit https://groups.google.com/d/optout.

--

Pavel Polyakov

Software Engineer - PHP team

E-mail: pavel@kredito.de
Skype: pavel.polyakov.x1

https://www.facebook.com/kreditech
Kreditech Holding SSL GmbH
Am Sandtorkai 50, 20457 Hamburg, Germany
Office phone: +49 (0)40 - 605905-60
Authorized representatives: Sebastian Diemer, Alexander Graubner-Müller
Company registration: Hamburg HRB122027

www.kreditech.com
facebook.com/kreditech https://www.facebook.com/kreditech

https://www.facebook.com/kreditech

This e-mail contains confidential and/or legally protected information.
If you are not the intended recipient or if you have received this e-mail
by error please notify the sender immediately and destroy this e-mail. Any
unauthorized review, copying, disclosure or distribution of the material in
this e-mail is strictly forbidden. The contents of this e-mail is legally
binding only if it is confirmed by letter or fax. The sending of e-mails to
us does not have any period-protecting effect. Thank you for your
cooperation.


(system) #8