Update single field of a document


(Ridvan Gyundogan) #1

Is it possible to update a single field of a document without sending the
whole document again?
I tried something like the following, but it just replaces the whole
document with one containing only the "price1" field:
curl -XPUT 'http://localhost:9200/giftsplusindex/giftsplusproduct/530/price1' -d '{
"price1" : "19.53"
}'


(Ivan Brusic) #2

Ridvan,

It is not possible to update only certain fields; you must reindex the
entire document.

--
Ivan


(Ridvan Gyundogan) #3

Hm, then this is my number one vote for a change request.


(vineeth mohan) #4

I would also be happy to see this feature.

Thanks
Vineeth


(David Pilato) #5

I think you can easily handle it on your side.

  • Ask ES for your document (GET /index/doc/1)
  • Modify your field
  • Send the new version of the document back to ES (PUT /index/doc/1)

Hope this helps
David ;)
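
The three steps above can be sketched as follows. This is only a minimal illustration, assuming an in-memory stand-in for the index: the class InMemoryIndex, the helper update_field and the "name" field are hypothetical and not part of ES. In a real client, get and put would be the HTTP GET and PUT calls from the list.

```python
import copy


class InMemoryIndex:
    """Hypothetical stand-in for an index: put() replaces the whole document."""

    def __init__(self):
        self._docs = {}

    def get(self, doc_id):
        # Plays the role of GET /index/doc/<id>: returns a copy of the document.
        return copy.deepcopy(self._docs[doc_id])

    def put(self, doc_id, doc):
        # Plays the role of PUT /index/doc/<id>: the stored document is replaced.
        self._docs[doc_id] = copy.deepcopy(doc)


def update_field(index, doc_id, field, value):
    """Get the document, change one field, send the whole document back."""
    doc = index.get(doc_id)
    doc[field] = value
    index.put(doc_id, doc)
    return doc


index = InMemoryIndex()
index.put("530", {"name": "gift box", "price1": "18.00"})

# Sending only the changed field replaces the document, as in the first post.
index.put("530", {"price1": "19.53"})
naive = index.get("530")

# Restoring the document and going through get/modify/put keeps "name".
index.put("530", {"name": "gift box", "price1": "18.00"})
updated = update_field(index, "530", "price1", "19.53")
```

The naive put loses every field that was not sent, while update_field keeps them at the cost of transferring the document twice.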


(Clinton Gormley) #6

To add to what David said, you can use ElasticSearch's versioning
feature to make sure that you don't overwrite any changes that were
made by someone else while you were updating the document.

clint
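
A rough sketch of how a version check protects the get/modify/put cycle is shown below. It assumes an in-memory stand-in rather than a real cluster: VersionedIndex, VersionConflict and update_field are hypothetical names, and the only behaviour taken from ES is the idea that a write carrying an outdated version is rejected.

```python
import copy


class VersionConflict(Exception):
    """Raised when a write carries a version that is no longer current."""


class VersionedIndex:
    """Hypothetical in-memory index that keeps a version number per document."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, document)

    def get(self, doc_id):
        version, doc = self._docs[doc_id]
        return version, copy.deepcopy(doc)

    def put(self, doc_id, doc, expected_version=None):
        current = self._docs.get(doc_id, (0, None))[0]
        if expected_version is not None and expected_version != current:
            raise VersionConflict(f"expected {expected_version}, found {current}")
        self._docs[doc_id] = (current + 1, copy.deepcopy(doc))
        return current + 1


def update_field(index, doc_id, field, value, retries=3):
    """Get/modify/put, retrying from a fresh read if someone else wrote first."""
    for _ in range(retries):
        version, doc = index.get(doc_id)
        doc[field] = value
        try:
            return index.put(doc_id, doc, expected_version=version)
        except VersionConflict:
            continue
    raise VersionConflict(f"gave up on {doc_id} after {retries} attempts")


index = VersionedIndex()
index.put("1", {"title": "a", "votes": 1})

# One client reads the document, then another client writes before it does.
stale_version, stale_doc = index.get("1")
index.put("1", {"title": "a", "votes": 2})

try:
    index.put("1", stale_doc, expected_version=stale_version)
    conflict = False
except VersionConflict:
    conflict = True  # the stale write is rejected instead of losing "votes": 2

new_version = update_field(index, "1", "title", "b")
```

The point of the retry loop is that a rejected write is repeated from a fresh read, so the other client's change is merged in rather than overwritten.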


(Ridvan Gyundogan) #7

To be more concrete, this is my use case, or the use case I expect to
have shortly:
I have 1 million documents in elasticsearch, and only in elasticsearch,
because I am running performance tests and the documents are more or
less random.
Now, for the new functionality, I want to add a random "sellPrice" field
to each document.

What I started to write is code which takes all the documents out, adds
a random "sellPrice" to each of them and imports them back again, but
this does not look very efficient.
We very often have use cases where we add new fields for search.

I do not see how versioning helps me in this case.
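
For what it is worth, the export, modify and re-import loop described above can at least be batched. The sketch below only builds the newline-delimited body that a bulk request expects (one action line followed by one source line per document) and does not talk to a cluster. The helper names bulk_body and add_sell_price, the document id 531 and the price range are assumptions made for illustration; the index and type names come from the first post.

```python
import json
import random


def add_sell_price(docs, rng):
    """Yield (id, source) pairs with a random "sellPrice" added to each source."""
    for doc_id, source in docs:
        enriched = dict(source)
        enriched["sellPrice"] = round(rng.uniform(1, 100), 2)
        yield doc_id, enriched


def bulk_body(index, doc_type, docs):
    """Build a newline-delimited bulk body that re-indexes every document."""
    lines = []
    for doc_id, source in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


# Documents as they would come back from a scan of the index.
existing = [
    ("530", {"price1": "19.53"}),
    ("531", {"price1": "7.20"}),
]

rng = random.Random(42)
body = bulk_body("giftsplusindex", "giftsplusproduct", add_sell_price(existing, rng))
```

The resulting body would be sent to the _bulk endpoint in batches; every document is still re-indexed in full, only the number of round trips goes down.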


(Ivan Brusic) #8

I should have added that one possible reason for the feature not being
supported is that updates are not handled by Lucene at all. With Lucene, a
document needs to be deleted before any new version can be reindexed.
Handling updates might need to happen from the bottom (Lucene) up.

Parent/child documents + versions might help with updates, but I have never
used either.

--
Ivan
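
To illustrate the delete-then-reindex point, here is a toy model of an append-only index. It is not Lucene's actual data structure or API: the class AppendOnlyIndex and its fields are invented for this sketch and only show why an "update" ends up as a delete followed by an add.

```python
class AppendOnlyIndex:
    """Toy append-only index: entries are never modified once written."""

    def __init__(self):
        self.entries = []     # (doc_id, document) pairs, in write order
        self.deleted = set()  # positions of entries marked as deleted
        self.live = {}        # doc_id -> position of the current entry

    def add(self, doc_id, doc):
        self.entries.append((doc_id, dict(doc)))
        self.live[doc_id] = len(self.entries) - 1

    def delete(self, doc_id):
        position = self.live.pop(doc_id, None)
        if position is not None:
            self.deleted.add(position)

    def update(self, doc_id, doc):
        # No in-place change is possible: mark the old entry, append a new one.
        self.delete(doc_id)
        self.add(doc_id, doc)

    def get(self, doc_id):
        return self.entries[self.live[doc_id]][1]


idx = AppendOnlyIndex()
idx.add("530", {"name": "gift box", "price1": "18.00"})
idx.update("530", {"name": "gift box", "price1": "19.53"})
```

After the update the old entry is still present but marked as deleted, and the whole new document had to be written again even though only one field changed.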


(Andy-2) #9

I vote for this feature as well.

I have a "popularity" field that holds the number of user votes a
document has received. I use it to influence result ranking. It is
frequently updated. Right now every time a user votes on a document
I'd need to reindex the entire document which is obviously very
inefficient.

It'd be great to have a way to update certain fields without
reindexing the entire document. Solr has an ExternalFileField field
type for this purpose but it's not very user friendly.

Don't know if it's possible to implement such an "update certain field
without reindexing the whole document" feature in ES but if it's
possible it'd be very useful.


(vineeth mohan) #10

I also vote for this feature.

Even if there is no native support for updating a field, having ES
delete and re-add the document internally would be much more convenient
than having the user do it.
Moreover, the operation could remain atomic if it is done internally.

Thanks
Vineeth


(Sindre Sorhus) #11

I also vote for this. I see a lot of convenient use-cases for this feature.


(Alex Piggott) #12

I should probably just go look at the ES code, but just in case
anybody knows off the top of their head: does ES create a single
document object and then store a pointer to that object for each
indexed field?

If so, then presumably you could efficiently update non-indexed fields
within a document?

I actually have a use case where I would ideally like to store (but
not index) some frequently changing statistics and then use them in
scripts and facets that are part of my queries. The documents
themselves are pretty big, and the system insertion rate is reasonably
high, so re-indexing them each time the statistics change is
(probably) undesirable.

(Currently I do some unpleasant joint MongoDB/ElasticSearch
processing, but I'd like to do it all in ElasticSearch if possible)

Alex


(Otis Gospodnetić) #13

Andy,

In Solr land ExternalFileField is designed for your use case (see
http://search-lucene.com/?q=ExternalFileField )
I think there is nothing like that in ES, but I'd love for somebody
to point out that I'm wrong about this! :)

Otis


(Shay Banon) #14

Otis, are you referring to this:
http://lucene.apache.org/solr/api/org/apache/solr/schema/ExternalFileField.html ?
And you think it's the same..., really? Are you sure you understand what it
means to provide updatable fields, and then to take them to a distributed
system? What I would love is for people to really think about "comparable"
"features" before throwing them out here (similar to the "update processor"
suggestion for notifications), with or without smilies.


(Otis Gospodnetić) #15

Hi Shay,

Sorry for the misunderstanding - I addressed Andy directly, referring
to his specific use case (updates of a popularity field), for which
ExternalFileField (EFF) is indeed a good choice.
I did not (mean to) say that EFF is the same as updates of single
fields. I even know the JIRA issue number for that functionality in Solr
by heart and have known it for the last few years, so I understand
what that particular feature is about. :)

But I am curious about EFF-like functionality in ES, so I'll just
start another thread about it in order not to hijack this one.

Otis
P.S.
Using a URP (update request processor) in Solr for notifications does
work - we've implemented this a few times before.


(Vladimir Kartaviy) #16

What about PartialUpdate plugin?

https://github.com/medcl/ElasticSearch.PartialUpdate


(plaflamme) #17

That's precisely what I'd need to support my use case. It'd be nice to see
this as a core feature within ES.

The use case is that sometimes a client may only know about an incremental
change to a document. Requiring that the client fetch the whole document,
merge the change and send it back for indexing is very expensive,
especially in a bulk situation.

In my case, the client knows that a field has (potentially) changed in ALL
documents. I'd like to be able to send the new value for every document as
a single (or several) bulk request(s).

There seems to be an issue for this feature request:
https://github.com/elasticsearch/elasticsearch/issues/426

Philippe


(Shay Banon) #18

The way to solve it now is to either query ES and reindex, or get a
document and reindex.


(plaflamme) #19

Yes, I understand. But the request is that this fetch/merge/re-index
process be done by the server/cluster instead of the client. The client
would send the change and the cluster would take care of doing the dirty
work.

Wouldn't doing this save bandwidth (and so be faster)? Seems heavy to have
to transfer the whole document twice (between client and cluster) instead
of only sending the partial update.

Thanks,
Philippe
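
To make the request concrete, this is roughly the merge the cluster would perform on the client's behalf. It is only a sketch: merge_partial is a hypothetical helper, not an ES feature, and the sample fields ("name", "stats") are invented. The size comparison at the end is a crude stand-in for the bandwidth argument above.

```python
import json


def merge_partial(stored, partial):
    """Return a new document: stored fields overlaid with the partial update."""
    merged = dict(stored)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested objects field by field instead of replacing them.
            merged[key] = merge_partial(merged[key], value)
        else:
            merged[key] = value
    return merged


stored = {"name": "gift box", "price1": "18.00", "stats": {"views": 10, "votes": 2}}
partial = {"price1": "19.53", "stats": {"votes": 3}}

merged = merge_partial(stored, partial)

# Client-side approach: the whole document travels out and back.
full_round_trip = 2 * len(json.dumps(stored))
# Server-side approach: only the changed fields travel.
partial_only = len(json.dumps(partial))
```

With the merge done server-side, the client would send only the partial document and the fetch and re-index would stay inside the cluster.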


(Shay Banon) #20

It will be simpler and will save a bit of bandwidth (though it's a premature
optimization that you are doing right there). For now, you need to do it
yourself.
