Checking results of bulk inserts, pre- and post-1.0


(axel) #1

Hi,

in my current code, I check "ok": true to see if all my inserts in a bulk
call succeeded.

Now, with 1.0, "ok" is gone. Is there a recommended way to check the
results of individual commands inside a bulk api call? Notably, inserts?

Preferably something that works both pre and post 1.0.

Thanks

Axel

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/1475f682-983c-4da8-8cec-7855bc717b69%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Brian Yoder) #2

Axel,

Well, I use the Java API and not the REST API. And while there is a lot
more code around this, here's the core of my bulk load process. I use the
BulkRequestBuilder (a new one for each set of bulk updates!) and not the
BulkProcessor because (a) this code was already written and works and (b)
it gives me some additional control. For instance, I can configure the
maximum error display so that, if I'm loading 1M documents but they all
fail with some stupid fault of mine, I can see the first 128 errors but not
get swamped with all 1 million since they are more than likely to be very
similar.

I first wrote this code for ES 0.19.4 and it has remained unchanged and
working perfectly up to and including ES 1.1.0:

    BulkResponse bulkResponse = bulkRequest.execute().actionGet();
    if (bulkResponse.hasFailures())
    {
        if (failedDocuments <= failedMsgLimit)
            Sysprint.err.println("BULK WARN: Bulk load failures: "
                + bulkResponse.buildFailureMessage());

        BulkItemResponse[] items = bulkResponse.getItems();
        for (BulkItemResponse item : items)
        {
            if (item.isFailed())
            {
                /* Failure detected */
                failedDocuments++;
                incrementFailedOpType(item.getOpType());
                if (failedDocuments <= failedMsgLimit)
                {
                    Sysprint.err.println("BULK WARN: Bulk failure for "
                        + item.getOpType() + ": " + item.getFailureMessage());
                }
            }
            else
            {
                /* Success: Keep track of version numbers > 1 */
                if (item.getVersion() > 1)
                    setVersionGtOne++;
            }
        }
    }

Hope this helps!

Brian



(axel) #3

Thanks for the response.

My code is in Python on top of the REST API, and I can't map the Java code
over :-/

Axel

On Friday, April 11, 2014 4:14:10 PM UTC+2, ax...@mozilla.com wrote:

Hi,

in my current code, I check "ok": true to see if all my inserts in a bulk
call succeeded.

Now, with 1.0, "ok" is gone. Is there a recommended way to check the
results of individual commands inside a bulk api call? Notably, inserts?

Preferably something that works both pre and post 1.0.

Thanks

Axel



(Honza Král) #4

Hi axel,

If you are using Python, you can just use the Python client
(elasticsearch-py); it will shield you from this. Just have a look at the
bulk and streaming_bulk helpers in the library.

Hope this helps,
Honza
On Apr 11, 2014 7:52 PM, axel@mozilla.com wrote:

Thanks for the response.

My code is in Python on top of the REST API, and I can't map the Java code
over :-/

Axel



(axel) #5

Hi Honza,

sadly, this doesn't seem to work.

Comparing the results from the _bulk API on 0.90 vs. 1.0.1:

0.90:

{"took":345,"items":[
{"index":{"_index":"test","_type":"type1","_id":"1","_version":1,"ok":true}},
{"delete":{"_index":"test","_type":"type1","_id":"2","_version":1,"ok":true}},
{"create":{"_index":"test","_type":"type1","_id":"3","_version":1,"ok":true}},
{"update":{"_index":"index1","_type":"type1","_id":"1","error":"DocumentMissingException[[index1][-1] [type1][1]: document missing]"}}
]}

1.0.1:

{"took":256,"errors":true,"items":[
{"index":{"_index":"test","_type":"type1","_id":"1","_version":1,"status":201}},
{"delete":{"_index":"test","_type":"type1","_id":"2","_version":1,"status":404,"found":false}},
{"create":{"_index":"test","_type":"type1","_id":"3","_version":1,"status":201}},
{"update":{"_index":"index1","_type":"type1","_id":"1","status":404,"error":"DocumentMissingException[[index1][-1] [type1][1]: document missing]"}}
]}

It seems that pre-1.0 ES doesn't send a status, and
https://github.com/elasticsearch/elasticsearch-py/blob/master/elasticsearch/helpers/__init__.py#L108
substitutes a 500 error code for the missing status. So basically
all my inserts fail.
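
For code that talks to the REST API directly, one check suggested by the two responses above is to treat an item as failed when it carries an "error" key, since that key shows up in both formats while "ok" and "status" each appear in only one. The following is a minimal sketch under that assumption. The name failed_bulk_items is hypothetical, and the logic relies only on the response shapes quoted in this post, not on any documented guarantee.

```python
def failed_bulk_items(response):
    """Return (op_type, details) pairs for bulk items that report an error.

    Assumption: a failed item carries an "error" key, as in both sample
    responses above (0.90 and 1.0.1). "ok" and "status" are not consulted
    because each exists in only one of the two formats.
    """
    failures = []
    for item in response.get("items", []):
        # Each item is a one-key dict such as {"index": {...}}.
        for op_type, details in item.items():
            if "error" in details:
                failures.append((op_type, details))
    return failures


ERROR = "DocumentMissingException[[index1][-1] [type1][1]: document missing]"

# Abbreviated versions of the two responses quoted above.
old_style = {"took": 345, "items": [
    {"index": {"_index": "test", "_type": "type1", "_id": "1",
               "_version": 1, "ok": True}},
    {"update": {"_index": "index1", "_type": "type1", "_id": "1",
                "error": ERROR}},
]}
new_style = {"took": 256, "errors": True, "items": [
    {"index": {"_index": "test", "_type": "type1", "_id": "1",
               "_version": 1, "status": 201}},
    {"delete": {"_index": "test", "_type": "type1", "_id": "2",
                "_version": 1, "status": 404, "found": False}},
    {"update": {"_index": "index1", "_type": "type1", "_id": "1",
                "status": 404, "error": ERROR}},
]}

for response in (old_style, new_style):
    for op_type, details in failed_bulk_items(response):
        print(op_type, details["_id"], details["error"])
```

Note that the delete of a missing document (status 404, "found": false on 1.0.1) is not counted as a failure here, which matches the "ok": true that 0.90 reports for the same operation. Callers who care about that case would need an extra check on 1.x.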

I like the API, though.

RFE: the docs should describe the return values of the APIs; I basically
had to work those out by trial and error.

Axel

On Saturday, April 12, 2014 02:03:24 UTC+2, Honza Král wrote:

Hi axel,

If you are using Python, you can just use the Python client
(elasticsearch-py); it will shield you from this. Just have a look at the
bulk and streaming_bulk helpers in the library.

Hope this helps,
Honza


(Honza Král) #6

Hi Axel,

Unfortunately, there is no code in the Python client to shield you from
the incompatibilities. There are, however, two release series of
elasticsearch-py: 0.4.X and 1.0.X. Use 0.4.X with elasticsearch 0.90.*
and 1.0.X with elasticsearch 1.*. That should get you what you need.
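
The pairing above can be written down as a small lookup, for instance in a deployment script that has to pick a client release. This is only a sketch: client_series_for is a hypothetical helper, and the version string is assumed to be supplied by the caller (for example from the version number the server reports about itself, which is an assumption here and not something stated in this thread).

```python
def client_series_for(server_version):
    """Map an Elasticsearch version string to the elasticsearch-py series
    recommended above: 0.4.X for 0.90.*, 1.0.X for 1.*.
    """
    parts = server_version.split(".")
    major, minor = int(parts[0]), int(parts[1])
    if (major, minor) == (0, 90):
        return "0.4.X"
    if major == 1:
        return "1.0.X"
    # Anything else is outside what this thread covers.
    raise ValueError("no recommendation for Elasticsearch " + server_version)


for version in ("0.90.13", "1.0.1", "1.1.0"):
    print(version, "->", client_series_for(version))
```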

Honza

On Wed, Apr 23, 2014 at 6:43 PM, axel@mozilla.com wrote:

Hi Honza,

sadly, this doesn't seem to work.

Comparing the results from the _bulk API on 0.90 vs. 1.0.1:

0.90:

{"took":345,"items":[
{"index":{"_index":"test","_type":"type1","_id":"1","_version":1,"ok":true}},
{"delete":{"_index":"test","_type":"type1","_id":"2","_version":1,"ok":true}},
{"create":{"_index":"test","_type":"type1","_id":"3","_version":1,"ok":true}},
{"update":{"_index":"index1","_type":"type1","_id":"1","error":"DocumentMissingException[[index1][-1] [type1][1]: document missing]"}}
]}

1.0.1:

{"took":256,"errors":true,"items":[
{"index":{"_index":"test","_type":"type1","_id":"1","_version":1,"status":201}},
{"delete":{"_index":"test","_type":"type1","_id":"2","_version":1,"status":404,"found":false}},
{"create":{"_index":"test","_type":"type1","_id":"3","_version":1,"status":201}},
{"update":{"_index":"index1","_type":"type1","_id":"1","status":404,"error":"DocumentMissingException[[index1][-1] [type1][1]: document missing]"}}
]}

It seems that pre-1.0 ES doesn't send a status, and
https://github.com/elasticsearch/elasticsearch-py/blob/master/elasticsearch/helpers/__init__.py#L108
substitutes a 500 error code for the missing status. So basically all
my inserts fail.

I like the API, though.

RFE: the docs should describe the return values of the APIs; I basically
had to work those out by trial and error.

Axel


