Bulk - insert if not exists, update otherwise?


(Christopher Ambler-2) #1

I have a situation where I have code that does 3000 bulk inserts of data.
This works just fine.

What I now need to do is 3000 bulk inserts OR updates.

Specifically, if the key already exists, I need to UPDATE fields A, B and
C. If the key does NOT exist, I need to INSERT all fields. But again, if
I'm doing the UPDATE, it's just modifying the values of three fields and I
want to leave the others alone.

I can easily see how I'd do this in two API calls if I were doing each item
at a time. A HEAD on the key to see if it exists, and then either the
INSERT or UPDATE depending on the result.

But is there a way to do this in the bulk API for efficiency? Doing it one
at a time will be orders of magnitude slower.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/e86adebf-cfef-4c91-97df-9613cb386ab1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(eunever32) #2

What you're describing is the upsert functionality in the mvel scripting.

The upsert will create and populate when the key doesn't exist.

And the update api will add to the document if it does already exist.
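As a rough illustration of the idea, here is a minimal Python sketch that builds the body of a bulk request in which each item is an `update` action carrying both a partial `doc` (applied when the key exists) and an `upsert` document (indexed when it does not). This uses the script-free `doc` form of the update API rather than MVEL. The helper names `bulk_upsert_lines` and `bulk_body` are made up for this example, and the sample values are borrowed from later in this thread; the sketch only builds the request text and does not send it.

```python
import json


def bulk_upsert_lines(index, doc_type, doc_id, changed_fields, full_document):
    """Return the two NDJSON lines for one update-or-insert bulk item.

    Hypothetical helper: `changed_fields` is merged into an existing
    document, `full_document` is indexed as-is when the id is missing.
    """
    action = {"update": {"_index": index, "_type": doc_type, "_id": doc_id}}
    payload = {"doc": changed_fields, "upsert": full_document}
    return [json.dumps(action), json.dumps(payload)]


def bulk_body(items):
    """Join all items into one newline-delimited body with a trailing newline."""
    lines = []
    for item in items:
        lines.extend(bulk_upsert_lines(**item))
    return "\n".join(lines) + "\n"


# Sample item; the values are taken from the example later in this thread.
items = [
    {
        "index": "aftermarket-2014-08-11_02-38-19",
        "doc_type": "premium",
        "doc_id": "kryptonblue.com",
        "changed_fields": {"auctionprice": 4488, "auctionstatus": 4},
        "full_document": {
            "auctionprice": 4488,
            "auctionstatus": 4,
            "sld": "kryptonblue",
            "tld": "com",
        },
    }
]

body = bulk_body(items)
print(body)
```

The resulting string would be sent as the body of a single bulk request, so all 3000 items can travel in one call instead of two calls per item.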



(Christopher Ambler-2) #3

Excellent - are there any examples I can see? Where can I read and learn
how to do this?

On Monday, August 11, 2014 10:19:51 AM UTC-7, eune...@gmail.com wrote:

What you're describing is the upsert functionality in the mvel scripting.

The upsert will create and populate when the key doesn't exist.

And the update api will add to the document if it does already exist.



(Christopher Ambler-2) #4

Wait, hang on - I just saw this in the docs. Are you suggesting a solution
that's being deprecated? If so, that's likely not a good idea.

I'm now confused :wink:

Warning: Deprecated in 1.3.0.

Mvel has been deprecated and will be removed in 1.4.0.



(eunever32) #5

I see what you mean about MVEL.

I guess the same functionality is available in its replacement, Groovy.

I will need to investigate switching to Groovy.



(Christopher Ambler-2) #6

Okay, so now here's where I am - I read up on upsert and crafted my bulk
stack. It seems to work:

{"update":{"_index":"aftermarket-2014-08-11_02-38-19","_type":"premium","_id":"kryptonblue.com"}}
{"script" : "ctx._source.auctionid=6623102; ctx._source.auctiontype=18; ctx._source.auctionstatus=4; ctx._source.auctionprice=4488; ctx._source.auctionendtime='Oct 7 2014 09:10:00:000AM'; ctx._source.auctionadult=false;", "upsert": { "auctionid": 6623102, "auctiontype": 18, "auctionstatus": 4, "auctionprice": 4488, "auctionendtime": "Oct 7 2014 09:10:00:000AM", "auctionadult": false, "domaintype": "auction", "sld": "kryptonblue", "tld": "com", "vendorid": 0, "price": 0, "commissionrate": 0, "isfasttransfer": false, "tokens": ["krypton","blue"]}}

This seems to do the right thing. Here's the result I get back:

        [0] => stdClass Object
            (
                [update] => stdClass Object
                    (
                        [_index] => aftermarket-2014-08-11_02-38-19
                        [_type] => premium
                        [_id] => kryptonblue.com
                        [_version] => 1
                        [status] => 201
                    )

            )

This was done as an insert - there was nothing to update. So 201 seems right. What can I expect on an update? A straight 200?
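For checking a whole batch rather than one item, a small Python sketch along these lines could tally the per-item results. It assumes, and this is an assumption rather than something confirmed in this thread, that an item which updated an existing document reports status 200 while an upsert that created one reports 201, as in the result above. The function name `summarize_bulk_items` is made up, and the second and third sample items are invented for illustration.

```python
def summarize_bulk_items(items):
    """Count created, updated and failed items in a bulk response.

    Assumption: 201 means the upsert created the document and 200 means
    an existing document was updated. Anything else counts as a failure.
    """
    counts = {"created": 0, "updated": 0, "failed": 0}
    failures = []
    for item in items:
        result = item.get("update", {})
        status = result.get("status")
        if status == 201:
            counts["created"] += 1
        elif status == 200:
            counts["updated"] += 1
        else:
            counts["failed"] += 1
            failures.append((result.get("_id"), status))
    return counts, failures


# The first item mirrors the result shown above; the other two are invented.
sample_items = [
    {"update": {"_id": "kryptonblue.com", "_version": 1, "status": 201}},
    {"update": {"_id": "example-updated.com", "_version": 2, "status": 200}},
    {"update": {"_id": "example-conflict.com", "status": 409}},
]

counts, failures = summarize_bulk_items(sample_items)
print(counts)
print(failures)
```

Running the same bulk request a second time and comparing the tallies is a quick way to confirm which status the cluster actually reports for updates.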



(eunever32) #7

I think you're on the right track. If you just run again you should get the update.

Does the document appear correct?

Note the new option in 1.4, scripted_upsert: true, which allows the
document to be sent once for efficiency.
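To make the scripted_upsert idea concrete, here is a hedged Python sketch of what such a bulk item might look like. It assumes the 1.4-style request layout with top-level `script`, `params`, `scripted_upsert` and an empty `upsert` document, so the values travel once in `params` and the script runs whether or not the document exists. The helper name `scripted_upsert_lines` and the script text are illustrative, not taken from this thread, and the exact request format should be checked against the docs for the version in use.

```python
import json


def scripted_upsert_lines(index, doc_type, doc_id, script, params):
    """Return the two NDJSON lines for one scripted-upsert bulk item.

    Assumed layout: the script receives `params` as variables and is run
    against an empty document when the id does not exist yet.
    """
    action = {"update": {"_index": index, "_type": doc_type, "_id": doc_id}}
    payload = {
        "script": script,
        "params": params,
        "scripted_upsert": True,
        "upsert": {},  # empty starting document; the script fills it in
    }
    return [json.dumps(action), json.dumps(payload)]


# Illustrative script: copies two parameter values into the document.
script = (
    "ctx._source.auctionprice = auctionprice; "
    "ctx._source.auctionstatus = auctionstatus;"
)
lines = scripted_upsert_lines(
    "aftermarket-2014-08-11_02-38-19",
    "premium",
    "kryptonblue.com",
    script,
    {"auctionprice": 4488, "auctionstatus": 4},
)
print("\n".join(lines))
```

In this form the script only sets the fields passed in `params`. Fields that should be written only when the document is first created would need extra logic in the script itself.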


