Benefits of using bulk with node client?

rore · November 10, 2014, 6:34am

I can definitely see the point of using the bulk API when indexing via
HTTP.

But is there an advantage to using bulk instead of individual index requests
when using the client node? Since the node parses the bulk and routes each
request to its proper destination - and it's basically doing the same when
you submit individual requests - what is the benefit of doing a bulk
request in this case?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/28d54f90-f6b8-449a-806f-e873600dfdd5%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

dadoonet · November 10, 2014, 6:53am

Definitely!

Try it with and without bulk and you will see the difference.

The node does not parse the full doc, only the headers, and it streams your docs to the right shards.

I have noticed a huge difference between the two myself.

--
David
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs


rore · November 10, 2014, 6:58am

Ah, thanks, that's helpful to know!

But if that's the case, why does the node need to parse the
document for an individual request but not for bulk? It could also
stream the doc to the right shard based on the metadata (id and
routing) without parsing the doc, the same as it does with the bulk
API, no?


jprante · November 10, 2014, 8:15am

The node does not parse the bulk, only part of it (the metadata lines for
hashing and routing).
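
To illustrate the idea of routing on the metadata lines alone, here is a minimal Python sketch. It is not Elasticsearch code: `route_bulk` is a made-up name, `zlib.crc32` is only a stand-in for whatever hash the cluster really uses, and the `_id` / `routing` keys and the index-only body are simplifying assumptions. The point is just that the source line of each doc is passed along as an opaque string and never decoded.

```python
import json
import zlib
from collections import defaultdict


def route_bulk(body, num_shards):
    """Group (metadata line, source line) pairs by target shard.

    Only the metadata line is decoded as JSON. The source line is
    forwarded untouched, so it does not even have to be valid JSON here.
    """
    lines = body.strip().split("\n")
    per_shard = defaultdict(list)
    # Assumption: every action is "index", so lines come in pairs.
    for meta_line, source_line in zip(lines[0::2], lines[1::2]):
        meta = json.loads(meta_line)["index"]
        # Assumption: an explicit routing value wins over the id.
        key = meta.get("routing", meta["_id"])
        # crc32 is a stand-in hash, NOT the one Elasticsearch uses.
        shard = zlib.crc32(key.encode("utf-8")) % num_shards
        per_shard[shard].append((meta_line, source_line))
    return dict(per_shard)


if __name__ == "__main__":
    bulk_body = "\n".join([
        '{"index": {"_id": "1"}}', '{"title": "first doc"}',
        '{"index": {"_id": "2"}}', '{"title": "second doc"}',
        '{"index": {"_id": "3", "routing": "user-7"}}', '{"title": "third doc"}',
    ]) + "\n"
    for shard, pairs in sorted(route_bulk(bulk_body, 3).items()):
        print(shard, [source for _, source in pairs])
```

Each per-shard group could then be forwarded as one sub-request, which is where the saving in round trips comes from.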

The benefit of bulk requests is easy to see at the network layer.

Assume 1000 docs:

Without bulk, the client sends one request per doc and waits for the
response to each one: the client must submit 1000 packets on the wire and
the server must send 1000 responses back. For each doc there is also a
send/receive cycle at the inner shard level, which adds another 1000 sends
and 1000 receives. That makes around 4000 packets on the wire (the worst
case, where the connected server node does not hold a shard of the index),
with all the delays.

With bulk, the client submits 1 request, the server submits subpackets to
each node that holds a shard of the index, and it submits 1 response back.
That makes around 1 + (2 * n) + 1 packets, where n is the number of nodes.
With 3 nodes, you have 8 packets instead of 4000.

The same holds for both HTTP and the transport protocol; HTTP is only used
for accepting client requests.
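
As a quick check of the arithmetic above, a small Python sketch. The function names are made up for illustration, and the counts are the idealized "packets" from the estimate above (one per request or response), not real TCP packets.

```python
def packets_without_bulk(docs):
    # Per doc: client -> node, node -> shard node,
    # shard node -> node, node -> client (worst case).
    return 4 * docs


def packets_with_bulk(nodes):
    # One client request, one sub-request and one response per node
    # holding a shard, and one response back to the client.
    return 1 + 2 * nodes + 1


print(packets_without_bulk(1000))  # 4000
print(packets_with_bulk(3))        # 8
```

Note that under this estimate the bulk count depends only on the number of nodes, not on the number of docs in the request.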

Jörg


rore · November 10, 2014, 8:31am

Makes sense. Thanks!

