Bulk update not updating some entries


(William Wong) #1

Hi all,

I'm using the bulk API to update a lot of entries generated from logstash. In
each response, every item comes back with "ok":true. Still, I notice that
sometimes some of the entries are not actually updated. Is there any way to
track down what is happening? Thanks.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Alexander Reelsen) #2

Hey,

can you recreate the bulk requests and see whether this also happens when you
do it manually?

How do you notice that the entries are not updated? By default it might take
up to one second (the setting is called refresh_interval) until the indexed
documents are available for search.

Also, where did you get the 'ok: true' response from? From the complete
bulk request or did you check each response?

Any exceptions in the log files?

--Alex



(William Wong) #3

Hi Alex,

Thanks for your response. Please see my inline answers.

On Mon, Oct 28, 2013 at 5:36 PM, Alexander Reelsen alr@spinscale.de wrote:

Hey,

can you recreate the bulk requests and see whether this also happens when you
do it manually?

WW> Yes, I can recreate the bulk requests. As I said before, this doesn't
always happen, but when it does, some entries are left unchanged.

How do you notice that the entries are not updated? By default it might take
up to one second (the setting is called refresh_interval) until the indexed
documents are available for search.

WW> After each update, I check all the entries again after a while. In my
last test, some entries were still unchanged after 5 minutes. I also count
the number of ok:true items (see below) to make sure the numbers match up.

Also, where did you get the 'ok: true' response from? From the complete
bulk request or did you check each response?

WW> I got the "ok" from the individual items in the bulk request response.
They look like the sample below. Is that the right field to check?

{"took":19,"items":[{"update":{"_index":"logstash-2013.10.25","_type":"logs","_id":"ohab8v8fQgShcENjsniNDg","_version":40,"ok":true}},{"update":{"_index":"logstash-2013.10.25","_type":"logs","_id":"0FpjhcyLQZuDxcjNkiV5UQ","_version":40,"ok":true}},{"update":{"_index":"logstash-2013.10.25","_type":"logs","_id":"PFslcv7ARnWo6xF84UrjUA","_version":40,"ok":true}},{"update":{"_index":"logstash-2013.10.25","_type":"logs","_id":"aXiHpY0yRueUIH420uoLcg","_version":40,"ok":true}},{"update":{"_index":"logstash-2013.10.25","_type":"logs","_id":"n1_5hgHHTR-i1aqPnGk5ZQ","_version":40,"ok":true}},
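For anyone checking a response like the one above, here is a minimal Python sketch that walks every item instead of only counting "ok":true. It assumes the response shape shown in this thread (each item is a single-key object with "_id" and "ok") and assumes that a failed item carries an "error" field instead. The function name `summarize_bulk_items` is made up for illustration, and `sample` is just the first two items from the response above, closed off by hand so that it parses.

```python
import json


def summarize_bulk_items(response_text):
    """Split the items of a bulk response into succeeded ids and failed items."""
    response = json.loads(response_text)
    ok_ids, failed = [], []
    for item in response.get("items", []):
        # Each item is a single-key dict such as {"update": {...}}.
        action, result = next(iter(item.items()))
        if result.get("ok") is True and "error" not in result:
            ok_ids.append(result["_id"])
        else:
            # Assumption: failures are reported through an "error" field.
            failed.append((action, result.get("_id"), result.get("error")))
    return ok_ids, failed


# First two items of the response quoted in this thread, closed by hand.
sample = (
    '{"took":19,"items":['
    '{"update":{"_index":"logstash-2013.10.25","_type":"logs",'
    '"_id":"ohab8v8fQgShcENjsniNDg","_version":40,"ok":true}},'
    '{"update":{"_index":"logstash-2013.10.25","_type":"logs",'
    '"_id":"0FpjhcyLQZuDxcjNkiV5UQ","_version":40,"ok":true}}]}'
)

ok_ids, failed = summarize_bulk_items(sample)
print(len(ok_ids), "ok,", len(failed), "failed")
```

Collecting the ids, rather than only a count, makes it possible to compare them later against the list of ids that was supposed to be updated.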

Any exceptions in the log files?

WW> No exceptions in the log, which is why I have no clue why this is
happening. Is there a debugging flag I can turn on?

This problem happens when my change is large. In my test, it happens if my
change batch has 80000+ entries. However, I already split the work into bulk
calls of 500 entries each (they go like this: 1. 0-500, 2. 501-1000,
3. 1001-1500, etc.). I'm not sure whether the two are related.
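One way to rule out the batching itself is to cut the list with half-open slices, so that no entry can land in two batches or fall between them. If the ranges quoted above were inclusive, the first one (0-500) would hold 501 entries, which is the kind of off-by-one this avoids. This is only a sketch: `chunked` and the generated ids are hypothetical, not taken from the actual script.

```python
def chunked(items, size=500):
    """Yield consecutive slices of at most `size` items, with no overlap or gaps."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hypothetical ids, only to show the batch boundaries.
ids = [f"doc-{n}" for n in range(1203)]
batches = list(chunked(ids))
print([len(batch) for batch in batches])  # [500, 500, 203]
```

Each batch can then be turned into one bulk request, and the batches concatenated back together give exactly the original list.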



(William Wong) #4

I found the root cause of this issue. When I first generate the update
list, I pull all the ids using the size option. This means ids could
potentially be duplicated, so some entries are updated more than once and
some are not changed at all. Thanks all!
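In case it helps someone else: a duplicated id still comes back with "ok":true each time it is updated, so the count of ok items can match the number of entries sent while fewer distinct documents are actually touched. A small Python sketch of the check that would have caught it, run on the id list before building the bulk requests (the function names and the sample ids are made up for illustration):

```python
from collections import Counter


def find_duplicates(ids):
    """Return {id: count} for every id that appears more than once."""
    counts = Counter(ids)
    return {doc_id: n for doc_id, n in counts.items() if n > 1}


def dedupe(ids):
    """Drop repeated ids while keeping the first-seen order."""
    return list(dict.fromkeys(ids))


# Made-up ids standing in for the list pulled from the index.
ids = ["a", "b", "a", "c", "b", "a"]
print(find_duplicates(ids))  # {'a': 3, 'b': 2}
print(dedupe(ids))           # ['a', 'b', 'c']
```

Comparing `len(ids)` with `len(dedupe(ids))` is a quick sanity check; if the two differ, the list used to build the update requests contains repeats.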



(system) #5