Hi.
We are trying to do the following and any help would be appreciated.
Say you run a search and 100,000 documents match.
We would like to increment a counter in each document that matched, and at the same time select the first page, say the first 50.
Can this be done in one operation, or maybe in parallel?
As I understand your problem, you can solve it with 2 queries.
base_query = {"query": { ... }}  # your search filter
query = {**base_query, "size": 50, "from": 0}  # here you add your limit, as you want only the first 50 results (page 1)
result = query.run()  # run the search; these 50 docs are what you display on your first page
Then, keeping the same base query that you use for your search, apply it to update all your docs:
update_query = {
    "script": {
        "source": "ctx._source.match_docs_increment++",
        "lang": "painless"
    },
    "query": base_query["query"]
}
update_query.run()  # run the update (update by query)
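To make the pseudocode above a bit more concrete, here is a sketch in Python that builds both request bodies from one base query. The counter field `match_docs_increment` and the example filter are assumptions; with the official Python client the bodies would be passed to `es.search(...)` and `es.update_by_query(...)`:

```python
def build_requests(base_query: dict, page: int = 0, page_size: int = 50):
    """Build the search body and the update-by-query body from one base query."""
    search_body = {
        **base_query,
        "size": page_size,
        "from": page * page_size,  # offset for the requested page
    }
    update_body = {
        "script": {
            "source": "ctx._source.match_docs_increment++",
            "lang": "painless",
        },
        "query": base_query["query"],  # same filter as the search
    }
    return search_body, update_body

# Example with a placeholder filter:
base = {"query": {"match": {"title": "elasticsearch"}}}
search_body, update_body = build_requests(base, page=0)

# With the official Python client, these would be sent as e.g.:
#   es.search(index="my-index", body=search_body)
#   es.update_by_query(index="my-index", body=update_body)
```

Keeping both bodies derived from one `base_query` guarantees the 50 displayed docs and the incremented docs always come from the same result set.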
If you have little search traffic this can be OK, but it will add load on your server, as you'll update your documents every time a search is made.
Is it to set a weight on your documents so you can sort by the most popular?
If it's just for statistics, you can dump the result of your request to a file and set up Filebeat to send it to Elasticsearch, in a different index or on a different server... it depends on what you want to do with it.
Many thanks for your response.
This is in fact for statistics purposes.
Running 2 queries will indeed be problematic.
Any chance you could help us achieve it using the most efficient method? Of course we will pay for your consultancy services.
I can help, but I can't provide a consultancy service.
Which language and framework are you using?
My solution is pretty simple: just send the content to a log file, which can be done in 2~3 lines of code (maybe less, depending on your framework), then use Filebeat to parse the logs and store them in Elasticsearch. I don't think this solution needs deep technical skill, as Filebeat is really easy to use.
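A minimal sketch of the log-file approach, assuming Python and the newline-delimited JSON format that Filebeat can parse; the file name and field names here are made up for illustration:

```python
import json
import time

def log_search(log_path: str, query: dict, total_hits: int) -> None:
    """Append one JSON line per search so Filebeat can ship it to Elasticsearch."""
    entry = {
        "@timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "query": query,
        "total_hits": total_hits,
    }
    # One JSON object per line: the format Filebeat's JSON parsing expects.
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")

log_search("searches.log", {"match": {"title": "elasticsearch"}}, 100000)
```

Filebeat would then tail `searches.log` and index each line as a document, keeping the counting work entirely off the search path.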
Hi,
Thanks for your response.
We are using the NEST library in an MVC Core application.
Unfortunately we have never used Filebeat, and it may take us more than a couple of lines of code compared to someone with more experience. Is there anyone you can recommend who could assist us with this task?
I can't think of a way in which you can efficiently do this in one operation (well, one request).
The update by query API would be the most logical way to increment a counter on 100,000 documents; however, it does not return the IDs of the documents that were updated, so you wouldn't be able to collect the first 50 documents and return them.
I think two requests, a search query that returns the first 50 documents and an update by query to increment the counters, executed at the same time, would be the most straightforward way to approach this.
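The two-requests-at-the-same-time idea can be sketched with Python's standard thread pool. `run_search` and `run_update` below are hypothetical stand-ins for the actual client calls (e.g. `es.search` and `es.update_by_query`):

```python
from concurrent.futures import ThreadPoolExecutor

def run_search(body: dict) -> list:
    # Placeholder for es.search(...): returns the first page of hits.
    return [{"_id": i} for i in range(body.get("size", 10))]

def run_update(body: dict) -> int:
    # Placeholder for es.update_by_query(...): returns the number of updated docs.
    return 100_000

search_body = {"query": {"match_all": {}}, "size": 50, "from": 0}
update_body = {
    "script": {"source": "ctx._source.match_docs_increment++", "lang": "painless"},
    "query": search_body["query"],
}

# Fire both requests concurrently, so the user-facing search is not
# blocked behind the (much slower) update by query.
with ThreadPoolExecutor(max_workers=2) as pool:
    search_future = pool.submit(run_search, search_body)
    update_future = pool.submit(run_update, update_body)
    hits = search_future.result()
    updated = update_future.result()
```

The search result can be returned to the user as soon as its future resolves; the update by query can be left to finish in the background (or even submitted fire-and-forget) since its result is only needed for statistics.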