Benefits of using bulk with node client?

I can definitely see the point of using the bulk API when indexing via
HTTP.

But is there an advantage to using bulk instead of individual index requests
when using the client node? The node parses the bulk and routes each
request to its proper destination, which is basically what it does when
you submit individual requests. So what is the benefit of doing a bulk
request in this case?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/28d54f90-f6b8-449a-806f-e873600dfdd5%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Definitely!

Try it with and without bulk and you will see the difference.

The node does not parse the full docs, only the headers, and it streams your docs to the right shards.

I have seen a huge difference between the two myself.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On 10 Nov 2014 at 07:34, Rotem rotem.hermon@gmail.com wrote:

I can definitely see the point of using the bulk API when indexing via HTTP.

But is there an advantage of using bulk instead of individual index request when using the client node? Since the node parses the bulk and routes each request to its proper destination - and it's basically doing the same when you submit individual requests - what is the benefit of doing a bulk request in this case?

Ah, thanks, that's helpful to know!

But if that's the case, why does the node need to parse the
document for an individual request but not for bulk? It could also
stream the doc to the right shard based on the metadata (id and
routing) without parsing the doc, the same as it does with the bulk API,
no?

On 11/10/2014 8:53, David Pilato wrote:

The node does not parse the full doc but only headers and
streams your docs to the right shards.

The node does not parse the bulk, only part of it (the metadata lines for
hashing and routing).
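
To make the point about metadata lines concrete, here is a minimal sketch of the idea, not the actual Elasticsearch implementation. It assumes a bulk body made only of index actions, each as one metadata line followed by one source line. The helper names, the `routing` key, the shard count, and the use of `zlib.crc32` as a stand-in hash are all assumptions for illustration.

```python
import json
import zlib

NUM_SHARDS = 3  # hypothetical number of primary shards


def build_bulk_body(docs):
    """Serialize (doc_id, source) pairs as newline-delimited JSON:
    one metadata line followed by one source line per doc."""
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_id": doc_id}}))
        lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"


def route(bulk_body, num_shards=NUM_SHARDS):
    """Group raw source lines by shard, parsing only the metadata lines."""
    lines = bulk_body.splitlines()
    by_shard = {}
    for meta_line, source_line in zip(lines[0::2], lines[1::2]):
        meta = json.loads(meta_line)["index"]
        key = meta.get("routing", meta["_id"])
        # Stand-in hash for illustration only, not the hash Elasticsearch uses.
        shard = zlib.crc32(key.encode("utf-8")) % num_shards
        # The source line is forwarded as an opaque string, never parsed here.
        by_shard.setdefault(shard, []).append(source_line)
    return by_shard
```

Only the small metadata line of each pair goes through `json.loads`; the document itself is passed along untouched, which is the behaviour described above.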

The benefit of bulk requests is simple to see on the network layer.

Assume 1000 docs:

  • without bulk, you send one request per doc and wait for the response to
    each: the client must submit 1000 packets on the wire and the server must
    send 1000 responses back. For each doc there is also a send/receive
    cycle at the inner shard level, which is another 1000 sends and 1000
    receives. That makes around 4000 packets on the wire (the worst case, where
    the connected server node does not hold a shard of the index), with all
    the delays.

  • with bulk, the client submits 1 request, the server submits subpackets to
    each node that holds a shard of the index, and it sends 1 response back.
    That makes around 1 + (2 * n) + 1 packets, where n is the number of nodes.
    With 3 nodes, you have 8 packets instead of 4000.
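
The arithmetic in the two cases above can be written out as a small sketch. The function names are made up for illustration, and the counts are the rough worst-case estimates from this message, not measurements.

```python
def packets_without_bulk(num_docs):
    # Per doc: client request + server response,
    # plus one send and one receive at the inner shard level.
    return 4 * num_docs


def packets_with_bulk(num_nodes):
    # 1 client request + (send + receive) per node holding a shard + 1 response.
    return 1 + 2 * num_nodes + 1


print(packets_without_bulk(1000))  # 4000
print(packets_with_bulk(3))        # 8
```

Note that the bulk count depends on the number of nodes rather than the number of docs, which is why the gap grows with larger batches.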

The same holds for both HTTP and the transport protocol; HTTP is only used
for accepting client requests.

Jörg

Makes sense. Thanks!

On 11/10/2014 10:15, joergprante@gmail.com wrote:

The node does not parse the bulk, only part of it (the metadata lines for hashing and routing).
