Frequently updated int field

Andy_2 · June 7, 2011, 3:12am

I have an integer field "popularity" that is frequently updated. The
"popularity" of a document can either be increased or decreased. I use
the value of that field to help rank my search results.

A search engine might not be designed for frequently updated fields
like that. Any tips on how best to handle that in ElasticSearch?

Thanks.

Mahendra_M · June 7, 2011, 4:42am

Hi Andy,

On Tue, Jun 7, 2011 at 8:42 AM, Andy selforganized@gmail.com wrote:

I have an integer field "popularity" that is frequently updated. The
"popularity" of a document can either be increased or decreased. I use
the value of that field to help rank my search results.

I have the same use case too: a "popularity" field that is updated
based on how often a document is used.

A search engine might not be designed for frequently updated fields
like that. Any tips on how best to handle that in Elasticsearch?

How frequent are your updates? I think Elasticsearch can handle
frequent updates pretty well.
Have a look at this link:
http://engineering.socialcast.com/2011/05/realtime-search-solr-vs-elasticsearch/

Even so, we use some other tricks to reduce the frequency of updates.
Instead of updating Elasticsearch on every change, we collect the usage
of a document over a period of time (say 10 minutes), aggregate the
result, and then write it to Elasticsearch in one update. Maybe you can
look at a similar approach.
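
As an illustration of the batching idea above (a minimal sketch, not the
poster's actual code): popularity changes are accumulated in memory, and
only the net change per document is handed to the index at each flush.
The class name PopularityBuffer and the flush_to_index callback are
hypothetical. How the callback writes to Elasticsearch (for example, by
reindexing each document) is left open here.

```python
from collections import defaultdict
from typing import Callable, Dict


class PopularityBuffer:
    """Accumulates popularity changes in memory and flushes them in batches."""

    def __init__(self, flush_to_index: Callable[[Dict[str, int]], None]) -> None:
        # flush_to_index is a hypothetical callback that applies the net
        # changes to the search index; it is an assumption of this sketch.
        self._flush_to_index = flush_to_index
        self._pending: Dict[str, int] = defaultdict(int)

    def record(self, doc_id: str, delta: int) -> None:
        """Record one increase (+n) or decrease (-n) for a document."""
        self._pending[doc_id] += delta

    def flush(self) -> Dict[str, int]:
        """Send net non-zero changes to the index and reset the buffer."""
        batch = {doc_id: d for doc_id, d in self._pending.items() if d != 0}
        self._pending = defaultdict(int)
        if batch:
            self._flush_to_index(batch)
        return batch
```

A scheduler (cron or similar) would call flush() once per interval, say
every 10 minutes. Increases and decreases that cancel out within the
interval never reach the index at all.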

Regards,
Mahendra

http://twitter.com/mahendra

Andy_2 · June 7, 2011, 5:56pm

Thanks Mahendra.

How do you implement your batched updates? Do you fire up a cron job
every x minutes to get the "popularity" values from a database and
then use it to update Elasticsearch?

ppearcy · June 7, 2011, 7:19pm

Hey Andy/Mahendra,
We do the exact same things that Mahendra mentions. We have a custom
data processing tool we use instead of cron, but it runs in a very
similar fashion. It works off of relative values, though, where we get
the number of document requests since the last run and only update
documents that have changed. On top of doing these updates in batch at
certain intervals, we are also considering ignoring documents with
only a request or two.
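
A rough sketch of the relative-value approach described above, under
stated assumptions: request counters are kept as cumulative totals per
document, and a snapshot from the previous run is available. The function
name changed_since_last_run and the min_requests parameter are
hypothetical and only illustrate the idea.

```python
from typing import Dict


def changed_since_last_run(
    previous: Dict[str, int],
    current: Dict[str, int],
    min_requests: int = 1,
) -> Dict[str, int]:
    """Return per-document request counts accumulated since the last run.

    previous and current map document IDs to cumulative request counters
    (hypothetical snapshots taken at the previous and the current run).
    Documents with fewer than min_requests new requests are skipped.
    """
    deltas = {}
    for doc_id, total in current.items():
        delta = total - previous.get(doc_id, 0)
        if delta >= min_requests:
            deltas[doc_id] = delta
    return deltas
```

With min_requests=3, documents that received only one or two requests
since the last run are left out of the batch. If skipped requests should
still count later, the caller would need to keep the old snapshot value
for those documents instead of overwriting it.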

We had hoped that parent/child documents would allow us to do this
more efficiently, but parent documents cannot be sorted by values in
child documents.

We haven't yet launched anything using this, but don't expect any
issues.

I had seen some interesting discussions around this at the Lucene
level, but don't believe any of it pertains to ES:
http://www.lucenerevolution.org/blog/2011/05/31/224/
http://www.mjohnston.com/2009/09/adding-external-datasources-to-lucene-scoring/
(A little older, so not sure if it is still relevant.)

Thanks,
Paul
