Contains Method Not Working in Sort Script

djue · August 28, 2019, 11:47pm

We have an old sort script that seems to have stopped working, and I can't quite figure out the cause (other than that we upgraded from Elastic 2 to 6). We have something like:

"sort": [
{
  "_script": {
    "order": "desc",
    "script": {
      "params": {
        "id": 12345
      },
      "source": "(doc['listedByIds'].size() > 0 && doc['listedByIds'].value == params.id) || doc['listedByIds'].contains(params.id) ? 1 : 0"
    },
    "type": "number"
  }
},
...other sort stuff
]

Basically, we have an array listedByIds that contains the ids of the users who have listed this listing, and we want listings made by id 12345 to sort to the top, ahead of the others.

I'm not sure why contains no longer works in this scenario (I also tried it standalone). We reviewed the Painless documentation, but doing a simple array access and comparing ids has proven to be a bit slow.

djue · August 29, 2019, 1:13am

Never mind, it looks like it may never have worked before, but it now works with:

(doc['listedByIds'].size() > 0 && doc['listedByIds'].value == params.id) || doc['listedByIds'].contains((long) params.id) ? 1 : 0
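
A likely reason the cast helps, sketched in plain Java rather than Painless: this assumes listedByIds is mapped as long (so its doc values are Long objects) and that params.id arrives from JSON as an Integer. Neither assumption is confirmed in this thread. The class and method names below are hypothetical and exist only for illustration.

```java
import java.util.List;

class ContainsDemo {
    // Mirrors contains(params.id) when the id is an int:
    // the int is boxed to Integer, and Integer.equals(Long) is always false.
    static boolean containsAsInt(List<Long> listedByIds, int id) {
        return listedByIds.contains(id);
    }

    // Mirrors contains((long) params.id):
    // the value is boxed to Long, so equals() can match the stored Long.
    static boolean containsAsLong(List<Long> listedByIds, int id) {
        return listedByIds.contains((long) id);
    }

    public static void main(String[] args) {
        List<Long> listedByIds = List.of(111L, 12345L);
        System.out.println(containsAsInt(listedByIds, 12345));  // false
        System.out.println(containsAsLong(listedByIds, 12345)); // true
    }
}
```

By contrast, the == comparison in the first clause uses numeric promotion rather than equals(), which would explain why that part matched without a cast.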

Could anyone from the Elastic team comment on the performance of this sort? We're considering rewriting it so we don't do a complex sort like this.

spinscale · August 29, 2019, 7:34am

Script sorting means that a script needs to be executed for every hit. If you have one million hits, you end up with one million executions. If there is any chance to avoid this, I suppose your queries would become much faster.

Maybe you can explain (without any technical terms) what you are after with this sort. It looks as if your sort result is basically (true|false), with true being first.

Do you need all the results, or would a filter potentially work as well? What is the intention of this scoring?

djue · August 29, 2019, 5:25pm

Hi Alex,

Basically, we would like to return listing results (generally up to 300k results), with the results from a specific id sorted to the top, followed by the others.

We would potentially need all the results as we handle pagination somewhere downstream. My initial thought was just to run two queries, looking for listings with the id and without and just combine them.
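
The two-query idea could be sketched roughly as follows. This assumes the two result sets are disjoint (one query matching listings that contain the id, the other excluding them) and that each is already ordered by the remaining sort criteria. The class and method names are hypothetical, and the queries themselves are not shown.

```java
import java.util.ArrayList;
import java.util.List;

class TwoQueryMerge {
    // Concatenates the two result lists so that listings made by the
    // given user come first, followed by everyone else's listings.
    // Each input list is assumed to be sorted already.
    static <T> List<T> listedByUserFirst(List<T> listedByUser, List<T> listedByOthers) {
        List<T> merged = new ArrayList<>(listedByUser.size() + listedByOthers.size());
        merged.addAll(listedByUser);
        merged.addAll(listedByOthers);
        return merged;
    }
}
```

Because the grouping is done by the queries themselves, no per-hit script runs, and downstream pagination can operate on the merged list.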

I inherited this piece of code and I've been questioning if it has ever worked since we have alerts when Elastic is taking X ms to respond and using X amount of CPU.

spinscale · August 30, 2019, 8:14am

Ah, so you are basically trying to always rank certain documents first. The next minor release of the Elastic Stack will have a new feature called pinned query, which allows you to do exactly that. Take a look at https://www.elastic.co/guide/en/elasticsearch/reference/7.4/query-dsl-pinned-query.html

