Contains Method Not Working in Sort Script

We have an old sort script that no longer seems to work, and I can't quite figure out the cause (other than that we upgraded from Elastic 2 to 6). We have something like:

"sort": [
{
  "_script": {
    "order": "desc",
    "script": {
      "params": {
        "id": 12345
      },
      "source": "(doc['listedByIds'].size() > 0 && doc['listedByIds'].value == params.id) || doc['listedByIds'].contains(params.id) ? 1 : 0"
    },
    "type": "number"
  }
},
...other sort stuff
]

Basically, we have an array listedByIds that contains the ids of the users that have listed this listing, and we want to sort listings made by id 12345 to the top, before the others.

I wasn't sure why contains no longer works in this scenario (I tried it standalone). We reviewed the Painless documentation, but doing a simple array access and comparing ids has proven to be a bit slow.

Never mind, it looks like it may never have worked before, but it now works with:

(doc['listedByIds'].size() > 0 && doc['listedByIds'].value == params.id) || doc['listedByIds'].contains((long) params.id) ? 1 : 0

Could anyone from the Elastic team comment on the performance of this sort? We're considering rewriting it so we don't do a complex sort like this.

Script sorting means that a script needs to be executed for every hit. If you have one million hits, you end up with one million executions. If there is any way to avoid this, I suppose your queries would become much faster.

Maybe you can explain (without any technical terms) what you are after with this sort. It looks as if your sort result is basically (true|false), with true being first.

Do you need all the results, or would a filter potentially work as well? What is the intention of this scoring?

Hi Alex,

Basically, we would like to return listing results (generally up to 300k results), with the results from a specific id sorted to the top, followed by the others.

We would potentially need all the results, as we handle pagination somewhere downstream. My initial thought was just to run two queries, one for listings with the id and one for listings without it, and combine them.
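The two-query idea can be sketched roughly as follows. This is only an illustration in Python that builds the two request bodies and concatenates the hits; the helper names build_split_queries and combine_hits are hypothetical, base_query stands in for whatever the real listing query is, and a plain term filter on listedByIds is assumed to be enough to select a user's listings.

```python
def build_split_queries(base_query, user_id, field="listedByIds"):
    """Build two search bodies: listings by user_id, and all other listings.

    base_query is a placeholder for the existing listing query (assumption).
    """
    term = {"term": {field: user_id}}
    mine = {"query": {"bool": {"must": [base_query], "filter": [term]}}}
    others = {"query": {"bool": {"must": [base_query], "must_not": [term]}}}
    return mine, others


def combine_hits(mine_hits, other_hits):
    """Put the user's own listings first, then everything else."""
    return list(mine_hits) + list(other_hits)


mine_body, others_body = build_split_queries({"match_all": {}}, 12345)
```

Both bodies can keep the remaining sort clauses unchanged, so no script has to run per hit; the cost is a second request and a merge step on the client side.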

I inherited this piece of code and I've been questioning if it has ever worked since we have alerts when Elastic is taking X ms to respond and using X amount of CPU.

Ah, so you basically want certain documents to always score first. The next minor release of the Elastic Stack will have a new feature called pinned query, which allows you to do exactly that. Take a look at https://www.elastic.co/guide/en/elasticsearch/reference/7.4/query-dsl-pinned-query.html
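As a rough sketch of what such a request body might look like, here it is built in Python. The helper name build_pinned_query and the example document ids are hypothetical. Note that the pinned query promotes documents by their document _id, not by a field value such as listedByIds, so the ids of the listings to promote would have to be known up front.

```python
def build_pinned_query(pinned_doc_ids, organic_query):
    """Build a search body that lists pinned_doc_ids first.

    organic_query is the regular query whose hits follow the pinned ones.
    """
    return {
        "query": {
            "pinned": {
                "ids": list(pinned_doc_ids),  # document _ids, in display order
                "organic": organic_query,
            }
        }
    }


# "listing-1" and "listing-2" are made-up document ids for illustration.
body = build_pinned_query(["listing-1", "listing-2"], {"match_all": {}})
```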
