Downside to using Bulk API for small/single-doc sets?

Nikita_Tovstoles · July 2, 2014, 10:40pm

Hi,

I am using the ES Java API to talk to an ES server. Sometimes I need to index a
single doc, sometimes dozens or hundreds at a time. I'd prefer to keep my
code simple (am a contrarian thinker) and wonder if I can get away with
always using the bulk API (i.e. BulkRequestBuilder), so that my interface to ES
would look like this:

void indexDoc(Doc doc);
void indexDocs(Collection<Doc> docs);

...but the implementation would always delegate to BulkRequestBuilder, with the
number of actions sometimes being just 1.

Is there a performance (or other) downside to this approach? Specifically,
would bulk index updates (with a set of size 1) take significantly longer
than non-bulk updates?

thanks,
-nikita
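
For readers who want to see the shape of the design being asked about, here is a minimal sketch. It is not the poster's code. BulkSender is a hypothetical stand-in for whatever wraps BulkRequestBuilder in a real client, docs are plain strings in place of the Doc type, and the empty-collection guard is an added assumption.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for the bulk side of the ES client; not a real ES type. */
interface BulkSender {
    void sendBulk(List<String> actions);
}

/** Both entry points funnel into a single bulk call, as the post proposes. */
class DocIndexer {
    private final BulkSender sender;

    DocIndexer(BulkSender sender) {
        this.sender = sender;
    }

    /** Single-doc path: wrap the doc in a one-element collection. */
    void indexDoc(String doc) {
        indexDocs(Collections.singletonList(doc));
    }

    /** Multi-doc path: one bulk request for the whole collection. */
    void indexDocs(Collection<String> docs) {
        if (docs.isEmpty()) {
            return; // assumption: nothing to send, so skip the request
        }
        sender.sendBulk(new ArrayList<>(docs));
    }
}
```

With this layout there is only one code path to test and one place to handle per-item failures, which is the simplicity the question is after.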

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/9a915ef3-812b-4905-8e4e-852aeb43a81c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

jprante · July 3, 2014, 8:35pm

A question back: do you observe a significant difference?

Jörg


Nikita_Tovstoles · July 3, 2014, 8:40pm

I do not. But I also do not know whether my tests recreate the conditions
where differences may surface.

Nikita

jprante · July 3, 2014, 9:18pm

Yes, the difference is not noticeable. Bulk requests carry some extra logic
for splitting the actions by shard, but with a single index request in the
bulk there is effectively no overhead.

Jörg
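
To illustrate what "shard splitting" could look like, here is a rough sketch, not Elasticsearch's actual code. The routing function (String.hashCode modulo the shard count) is a placeholder assumption, and the names ShardSplitSketch, shardFor and splitByShard are made up for this example. The point is only that a bulk with one action produces one group, so the grouping step does almost no work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ShardSplitSketch {
    /** Placeholder routing: hash of the doc id modulo shard count (not ES's real hash). */
    static int shardFor(String docId, int numShards) {
        return Math.floorMod(docId.hashCode(), numShards);
    }

    /** Groups doc ids by target shard so each shard can get its own sub-request. */
    static Map<Integer, List<String>> splitByShard(List<String> docIds, int numShards) {
        Map<Integer, List<String>> perShard = new TreeMap<>();
        for (String id : docIds) {
            perShard.computeIfAbsent(shardFor(id, numShards), k -> new ArrayList<>()).add(id);
        }
        return perShard;
    }
}
```

Calling splitByShard with a one-element list always returns a map with exactly one entry, whatever the shard count.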


Nikita_Tovstoles · July 3, 2014, 9:23pm

Thanks. Makes sense.
