Hi Shay,
I have a few suggestions on how the bulk API could be improved. First of all,
the data passed in a bulk API request is not proper JSON, but a
concatenation of JSON objects separated by newlines. Why can't it be a
proper JSON array? That would make it easier to create and modify the data
part using Jackson or similar JSON libraries. It would also remove the
current limitation of not being able to have newlines in the JSON data part.
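For what it is worth, a JSON library can still be used with the current newline-delimited format by serializing each object separately and joining the results with newlines. Below is a minimal sketch in Python using the standard `json` module; `build_bulk_body` is a hypothetical helper name, not part of any Elasticsearch client. By default `json.dumps` writes each object on a single line and escapes newlines inside string values, so the payload stays one object per line.

```python
import json

def build_bulk_body(pairs):
    """Serialize (action, source) pairs into a newline-delimited bulk body.

    `source` may be None for actions that carry no source line (e.g. delete).
    """
    lines = []
    for action, source in pairs:
        lines.append(json.dumps(action))
        if source is not None:
            lines.append(json.dumps(source))
    # The bulk endpoint expects the body to end with a newline.
    return "\n".join(lines) + "\n"

body = build_bulk_body([
    ({"index": {"_index": "lp", "_type": "6", "_id": "test1"}},
     {"field1": "line one\nline two"}),
    ({"delete": {"_index": "lp", "_type": "6", "_id": "test2"}}, None),
])
print(body)
```

The string value containing a newline survives the round trip, because the newline is escaped inside the serialized line rather than written out raw.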
I would like to suggest a bulk API request format like the following:
{
  "common": {
    "action": {
      "index": {
        "_index": "lp",
        "_type": "6"
      }
    },
    "body": {
      "field1": "value1"
    }
  },
  "individual": [
    {
      "action": {
        "update": {
          "_index": "lp",
          "_type": "6",
          "_id": "66207"
        }
      },
      "body": {
        "script": "ctx._source.intvalue= ctx._source.intvalue + 1"
      }
    },
    {
      "action": {
        "index": {
          "_index": "lp",
          "_type": "6",
          "_id": "test1"
        }
      },
      "body": {
        "field1": "value1"
      }
    },
    {
      "action": {
        "index": {
          "_index": "lp",
          "_type": "6",
          "_id": "test2"
        }
      },
      "body": {
        "field1": "value1"
      }
    },
    {
      "action": {
        "index": {
          "_index": "lp",
          "_type": "6",
          "_id": "id13"
        }
      }
    },
    {
      "action": {
        "index": {
          "_index": "lp",
          "_type": "6",
          "_id": "id14"
        }
      }
    },
    {
      "body": {
        "field2": "test value 2"
      }
    },
    {
      "body": {
        "field2": "test value 3"
      }
    }
  ]
}
The 'common' section contains the default 'action' and 'body' blocks. A
default is applied ONLY when the corresponding block is missing from an
element of the 'individual' section.
In the above example, the first three elements of the 'individual' section
are complete in themselves, with both an 'action' and a 'body' block.
The 4th and 5th elements of the 'individual' section lack the 'body'
block, so the 'body' block from the 'common' section is applied to them,
together with their own 'action' blocks.
Similarly, the 6th and 7th elements of the 'individual' section lack
the 'action' block, so the 'action' block from the 'common' section is
applied to them, together with their own 'body' blocks.
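The fallback rule described above can be sketched in a few lines of Python. This is only an illustration of the proposed format, not anything Elasticsearch implements; `expand_bulk` is a hypothetical name, and raising an error when no 'action' can be resolved is an assumption of this sketch.

```python
import copy

def expand_bulk(request):
    """Expand the proposed {'common', 'individual'} format into (action, body) pairs."""
    common = request.get("common", {})
    pairs = []
    for item in request.get("individual", []):
        # A block present on the item always wins; 'common' only fills gaps.
        action = item["action"] if "action" in item else common.get("action")
        body = item["body"] if "body" in item else common.get("body")
        if action is None:
            raise ValueError("item has no 'action' and 'common' provides none")
        # Copy so that later changes to one pair cannot leak into the defaults.
        pairs.append((copy.deepcopy(action), copy.deepcopy(body)))
    return pairs

request = {
    "common": {
        "action": {"index": {"_index": "lp", "_type": "6"}},
        "body": {"field1": "value1"},
    },
    "individual": [
        {"action": {"index": {"_index": "lp", "_type": "6", "_id": "id13"}}},
        {"body": {"field2": "test value 2"}},
    ],
}

for action, body in expand_bulk(request):
    print(action, body)
```

Here the first element picks up the common 'body' and the second picks up the common 'action', mirroring the 4th and 6th elements of the example.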
Makes sense?
Thanks and Regards,
Sujoy.
On Wednesday, May 30, 2012 10:53:19 AM UTC+5:30, jagdeep wrote:
Hey Shay,
Do you have any plans to incorporate this in a future release?
Regards
Jagdeep
On Wednesday, May 30, 2012 3:21:43 AM UTC+5:30, kimchy wrote:
It makes sense to have a bulk API for update. The tricky bit here is to
create a nice API design so that, if one uses the same script, it does not
have to be repeated for each item.
On Tue, May 29, 2012 at 8:58 AM, Sujoy Sett sujoysett@gmail.com wrote:
Thanks Benjamin.
I have opened an issue for it -
using bulk API with update (using scripts) in elasticsearch · Issue #1985 · elastic/elasticsearch · GitHub
Regards,
Sujoy.
On Monday, May 28, 2012 10:22:37 PM UTC+5:30, Benjamin Devèze wrote:
Bulk API does not support update requests currently, just index and
delete.
I think that if you open an issue for it on github it could be added.
On Mon, May 28, 2012 at 1:30 PM, Sujoy Sett sujoysett@gmail.com wrote:
Also, if I try to execute only a single update query via bulk API,
ActionRequestValidationException[Validation Failed: 1: no requests added;]
is returned.
REQUEST
curl -XPOST 'http://localhost:9200/_bulk' -d '{ "update" : { "_index" : "lp", "_type" : "6", "_id" : "66207" } }
{ "script" : "ctx._source.notesCount = ctx._source.notesCount + 1"}
{}'
RESPONSE
{"error":"ActionRequestValidationException[Validation Failed: 1: no requests added;]","status":500}
Does that mean that, unlike 'index' or 'delete', 'update' is not a valid
action for the bulk API? Has anyone faced this issue?
Regards,
Sujoy.
On Monday, May 28, 2012 2:19:32 PM UTC+5:30, Sujoy Sett wrote:
Hi,
Does the update API work with the bulk endpoint in ES? I am not able to
make it work. The update works on its own, as follows:
REQUEST
curl -XPOST 'http://localhost:9200/lp/6/66207/_update' -d '{ "script" : "ctx._source.notesCount = ctx._source.notesCount + 1"}'
RESPONSE
{"ok":true,"_index":"lp","_type":"6","_id":"66207","_version":6}
But when it is sent through the bulk API, the other actions execute, but the
update does not.
REQUEST
curl -XPOST 'http://localhost:9200/_bulk' -d '{ "update" : { "_index" : "lp", "_type" : "6", "_id" : "66207" } }
{ "script" : "ctx._source.notesCount = ctx._source.notesCount + 1"}
{ "index" : { "_index" : "lp", "_type" : "6", "_id" : "test1" } }
{ "field1" : "value1" }
{ "index" : { "_index" : "lp", "_type" : "6", "_id" : "test2" } }
{ "field1" : "value1" }
{}'
RESPONSE
{"took":31,"items":[{"index":{"_index":"lp","_type":"6","_id":"test1","_version":29,"ok":true}},{"index":{"_index":"lp","_type":"6","_id":"test2","_version":15,"ok":true}}]}
There is a response for the two index queries, but none for the update
query. The response shows no error either. What am I doing wrong? This is
creating a roadblock in our development. Please help.
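Since the response reports no error for the dropped action, one way to catch this on the client side is to compare what was sent with what came back. The following is a rough sketch in Python, assuming each entry of `items` is keyed by its action name and carries an `_id`; `missing_items` is a hypothetical helper, and the sample response is abbreviated from the one above.

```python
def missing_items(expected, response):
    """Return the (action, _id) pairs that were sent but have no entry in 'items'."""
    returned = set()
    for item in response.get("items", []):
        # Each item is a single-key dict: {"index": {...}}, {"delete": {...}}, ...
        for action, result in item.items():
            returned.add((action, result.get("_id")))
    return [pair for pair in expected if pair not in returned]

expected = [("update", "66207"), ("index", "test1"), ("index", "test2")]
response = {
    "took": 31,
    "items": [
        {"index": {"_index": "lp", "_type": "6", "_id": "test1", "_version": 29, "ok": True}},
        {"index": {"_index": "lp", "_type": "6", "_id": "test2", "_version": 15, "ok": True}},
    ],
}
print(missing_items(expected, response))
```

For the request above this reports the update on id 66207 as the only action without a matching item.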
Thanks and Regards,
Sujoy.
--
Benjamin DEVEZE