The performance of script-based sorting in ES version 7.4.2 is very poor

Hello all,
We recently compared the query performance of two ES clusters, one running 5.3.2 and the other running 7.4.2.
The index settings, mappings, and amount of data are exactly the same, yet query performance with a script sort is much worse on 7.4.2 than on 5.3.2.
Even for a single request, the two versions show a big difference in response time, as shown in the figure below.
Could you share what has changed in the underlying implementation of script sort, and how we can optimize the performance of this sort?
Note: the only difference between the two requests is the content of the script. Because of the version change there are some slight differences, but everything else is exactly the same.
e.g. doc['departure_city_ids'].values is changed to doc['departure_city_ids'].

The sort script used in 7.4.2 is below. The took time reported by ES is about 165 ms.

    "sort": [
        {
            "_script": {
                "script": {
                    "source": "boolean hasDepartureCity(org.elasticsearch.index.fielddata.ScriptDocValues.Strings values,String city){int low = 0;int high = values.size() - 1;while (low <= high){int mid=(low + high) >>> 1;String midVal = values.get(mid);int cmp = midVal.compareTo(city);if (cmp < 0){low = mid + 1;}else if (cmp > 0){high = mid - 1;}else{return true;}}return false;}long getScore(org.elasticsearch.index.fielddata.ScriptDocValues.Longs scores,long lowScore,long highScore){int low = 0;int high = scores.size() - 1;while (low <= high){int mid=(low + high) >>> 1;long midVal = scores.get(mid).longValue();if (midVal < lowScore){low = mid + 1;}else if (midVal > highScore){high = mid - 1;}else{return scores.get(mid).longValue()-lowScore;}}return 1L;}double total = _score;ArrayList sortedCities = params.sortedCities;HashMap cities = params.cities; String city = '';if (doc.containsKey('departure_city_ids')){for (i in sortedCities){        if(hasDepartureCity(doc['departure_city_ids'], i)){city = i; break;}}    if (city == '' && doc['departure_city_ids'].length > 0){        city =doc['departure_city_ids'][0];}}String  bk = params.tab + city;double hiveScore = 0;if (doc['sh_score_set'].size() == 0) return 0;if (city!='' && doc.containsKey('sh_score_set') && doc['sh_score_set'].value>0){long lowScore = (Long.valueOf(city,10).longValue()<<43);long highScore = lowScore + 100000;    hiveScore += getScore(doc['sh_score_set'],lowScore,highScore);}total += (hiveScore* (cities.containsKey(city) ? cities[city] : 1));return total;",
                    "lang": "painless",
                    "params": {
                        "sortedCities": [
                            "2",
                            "13",
                            "17",
                            "213",
                            "86",
                            "375"
                        ],
                        "cities": {
                            "2": 1000,
                            "13": 100,
                            "17": 50,
                            "86": 5,
                            "213": 10,
                            "375": 5
                        },
                        "departureCityPoi": 2,
                        "tab": "default_scores.",
                        "destCity": "0",
                        "boostOfTabVip": 0
                    }
                },
                "type": "number",
                "order": "desc"
            }
        },
        {
            "_score": {
                "order": "desc"
            }
        }
    ],
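Because the script above is packed into a single JSON string, its logic is hard to follow. The following is a minimal Java sketch of the same logic, written only to make it readable. It is not the code that ES runs: the class name ScriptSortSketch and the method name sortValue are hypothetical, plain List values stand in for ScriptDocValues, and the doc.containsKey checks and the unused bk variable are left out. The string binary search only works if the city ids arrive sorted in String.compareTo order, which the sketch assumes.

```java
import java.util.List;
import java.util.Map;

// Hypothetical class name; plain Lists stand in for ScriptDocValues.
class ScriptSortSketch {

    // Binary search for `city` among the document's city ids.
    // Assumes `values` is sorted in String.compareTo order.
    static boolean hasDepartureCity(List<String> values, String city) {
        int low = 0;
        int high = values.size() - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            int cmp = values.get(mid).compareTo(city);
            if (cmp < 0) {
                low = mid + 1;
            } else if (cmp > 0) {
                high = mid - 1;
            } else {
                return true;
            }
        }
        return false;
    }

    // Binary search for any value inside [lowScore, highScore].
    // Returns its offset from lowScore, or 1 if no value is in range.
    static long getScore(List<Long> scores, long lowScore, long highScore) {
        int low = 0;
        int high = scores.size() - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            long midVal = scores.get(mid);
            if (midVal < lowScore) {
                low = mid + 1;
            } else if (midVal > highScore) {
                high = mid - 1;
            } else {
                return midVal - lowScore;
            }
        }
        return 1L;
    }

    // Main body of the 7.4.2 script: pick a city, look up its score, weight it.
    static double sortValue(double score, List<String> cityIds, List<Long> scoreSet,
                            List<String> sortedCities, Map<String, Integer> cities) {
        String city = "";
        for (String candidate : sortedCities) {
            if (hasDepartureCity(cityIds, candidate)) {
                city = candidate;
                break;
            }
        }
        if (city.isEmpty() && !cityIds.isEmpty()) {
            city = cityIds.get(0); // fall back to the document's first city id
        }
        if (scoreSet.isEmpty()) {
            return 0; // empty-field guard, present only in the 7.4.2 script
        }
        double hiveScore = 0;
        if (!city.isEmpty() && scoreSet.get(0) > 0) {
            long lowScore = Long.parseLong(city) << 43;
            hiveScore += getScore(scoreSet, lowScore, lowScore + 100000);
        }
        return score + hiveScore * cities.getOrDefault(city, 1);
    }
}
```

Read this way, each document costs up to six binary searches over departure_city_ids (one per entry in sortedCities) plus one over sh_score_set, and every probe reads a value from doc values.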

The sort script used in 5.3.2 is below. The took time reported by ES is about 29 ms.

    "sort": [
        {
            "_script": {
                "script": {
                    "inline": "boolean hasDepartureCity(org.elasticsearch.index.fielddata.ScriptDocValues.Strings values,String city){int low = 0;int high = values.size() - 1;while (low <= high){int mid=(low + high) >>> 1;String midVal = values.get(mid);int cmp = midVal.compareTo(city);if (cmp < 0){low = mid + 1;}else if (cmp > 0){high = mid - 1;}else{return true;}}return false;}long getScore(org.elasticsearch.index.fielddata.ScriptDocValues.Longs scores,long lowScore,long highScore){int low = 0;int high = scores.size() - 1;while (low <= high){int mid=(low + high) >>> 1;long midVal = scores.get(mid).longValue();if (midVal < lowScore){low = mid + 1;}else if (midVal > highScore){high = mid - 1;}else{return scores.get(mid).longValue()-lowScore;}}return 1L;}double total = _score;ArrayList sortedCities = params.sortedCities;HashMap cities = params.cities; String city = '';if (doc.containsKey('departure_city_ids')){for (i in sortedCities){        if(hasDepartureCity(doc['departure_city_ids'].values, i)){city = i; break;}}    if (city == '' && doc['departure_city_ids'].values.length > 0){        city =doc['departure_city_ids'].values[0];}}String  bk = params.tab + city;double hiveScore = 0;if (city!='' && doc.containsKey('sh_score_set') && doc['sh_score_set'].value>0){long lowScore = (Long.valueOf(city,10).longValue()<<43);long highScore = lowScore + 100000;    hiveScore += getScore(doc['sh_score_set'].values,lowScore,highScore);}total += (hiveScore* (cities.containsKey(city) ? cities[city] : 1));return total;",
                    "lang": "painless",
                    "params": {
                        "sortedCities": [
                            "2",
                            "13",
                            "17",
                            "213",
                            "86",
                            "375"
                        ],
                        "cities": {
                            "2": 1000,
                            "13": 100,
                            "17": 50,
                            "86": 5,
                            "213": 10,
                            "375": 5
                        },
                        "departureCityPoi": 2,
                        "tab": "default_scores.",
                        "destCity": "0",
                        "boostOfTabVip": 0
                    }
                },
                "type": "number",
                "order": "desc"
            }
        },
        {
            "_score": {
                "order": "desc"
            }
        }
    ],
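Both scripts look up the per-city score in sh_score_set with the same arithmetic: lowScore is the city id shifted left by 43 bits, highScore is lowScore + 100000, and the result is the stored value minus lowScore. This suggests that each entry carries the city id in the high bits and a score between 0 and 100000 in the low bits. The Java sketch below illustrates that reading. The layout is inferred from the scripts, not confirmed by the index mapping, and the class name PackedCityScore and all method names are hypothetical; the indexing side that produces these values is not shown in this thread.

```java
// Hypothetical helper illustrating the layout that the sort script appears
// to assume for sh_score_set entries: (cityId << 43) + score.
class PackedCityScore {
    static final int CITY_SHIFT = 43;       // shift used in the script
    static final long MAX_SCORE = 100000L;  // range width used in the script

    // Assumed encoding; the real indexing code is not shown in the thread.
    static long pack(long cityId, long score) {
        return (cityId << CITY_SHIFT) + score;
    }

    // City id stored in the high bits.
    static long cityOf(long packed) {
        return packed >>> CITY_SHIFT;
    }

    // Score stored in the low bits, same as `value - lowScore` in the script.
    static long scoreOf(long packed) {
        return packed - (cityOf(packed) << CITY_SHIFT);
    }

    // Range test equivalent to the script's lowScore/highScore comparison.
    static boolean belongsTo(long packed, long cityId) {
        long low = cityId << CITY_SHIFT;
        return packed >= low && packed <= low + MAX_SCORE;
    }
}
```

Under this reading, entries for one city occupy a contiguous range of a sorted list of values, which is why getScore can find one with a binary search.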

I have also posted the test process as images; the pictures are as follows.


Can you provide a set of commands to replicate this, e.g. sample docs, mappings, etc.?

Hi warkolm,
Nice to meet you.
https://discuss.elastic.co/t/the-performance-of-script-based-sorting-in-es-7-4-2-verison-is-very-poor/261241
The index name is tour-nt-2-new; its settings in ES 7.4.2 are as follows:

{
  "tour-nt-2-new": {
    "settings": {
      "index": {
        "mapping": {
          "total_fields": {
            "limit": "20000"
          }
        },
        "refresh_interval": "60s",
        "indexing": {
          "slowlog": {
            "source": "10000"
          }
        },
        "translog": {
          "sync_interval": "60s",
          "durability": "async"
        },
        "provided_name": "tour-nt-2-new",
        "creation_date": "1610705080224",
        "store": {
          "preload": [
            "nvd",
            "dvd",
            "tim",
            "doc",
            "dim"
          ]
        },
        "unassigned": {
          "node_left": {
            "delayed_timeout": "60m"
          }
        },
        "analysis": {
          "filter": {
            "custom_length_filter": {
              "type": "length",
              "min": "2"
            }
          },
          "analyzer": {
            "default": {
              "type": "ik_max_word"
            },
            "ik_custom": {
              "filter": [
                "custom_length_filter"
              ],
              "tokenizer": "ik_max_word"
            }
          },
          "search_analyzer": {
            "default": {
              "type": "ik_smart"
            }
          }
        },
        "number_of_replicas": "1",
        "uuid": "bCHptGI7Ry2K1IcqHYj-Hg",
        "version": {
          "created": "7040299"
        },
        "search": {
          "slowlog": {
            "threshold": {
              "query": {
                "warn": "500ms"
              }
            }
          }
        },
        "number_of_shards": "4"
      }
    }
  }
}

The query statement is as follows:

{
    "from": 0,
    "size": 5,
    "timeout": "500ms",
    "query": {
        "bool": {
            "must": [
                {
                    "constant_score": {
                        "filter": {
                            "terms": {
                                "departure_city_ids": [
                                    17,
                                    2,
                                    213,
                                    86,
                                    375,
                                    13
                                ],
                                "boost": 1
                            }
                        },
                        "boost": 1
                    }
                },
                {
                    "constant_score": {
                        "filter": {
                            "term": {
                                "poids": {
                                    "value": 61,
                                    "boost": 1
                                }
                            }
                        },
                        "boost": 33554432
                    }
                }
            ],
            "filter": [
                {
                    "term": {
                        "available": {
                            "value": true,
                            "boost": 1
                        }
                    }
                },
                {
                    "term": {
                        "distribution_channels": {
                            "value": "1",
                            "boost": 1
                        }
                    }
                },
                {
                    "term": {
                        "locale_state": {
                            "value": true,
                            "boost": 1
                        }
                    }
                },
                {
                    "terms": {
                        "sale_channels": [
                            0,
                            5,
                            7
                        ],
                        "boost": 1
                    }
                },
                {
                    "term": {
                        "tabs": {
                            "value": 64,
                            "boost": 1
                        }
                    }
                }
            ],
            "should": [
                {
                    "constant_score": {
                        "filter": {
                            "term": {
                                "id": {
                                    "value": 28394659,
                                    "boost": 1
                                }
                            }
                        },
                        "boost": 103079215000
                    }
                },
                {
                    "constant_score": {
                        "filter": {
                            "term": {
                                "id": {
                                    "value": 27897206,
                                    "boost": 1
                                }
                            }
                        },
                        "boost": 68719477000
                    }
                },
                {
                    "constant_score": {
                        "filter": {
                            "term": {
                                "id": {
                                    "value": 27829811,
                                    "boost": 1
                                }
                            }
                        },
                        "boost": 34359738000
                    }
                }
            ],
            "adjust_pure_negative": true,
            "boost": 1
        }
    },
    "_source": {
        "includes": [
            "id",
            "name",
            "package_name",
            "type",
            "level",
            "provider_brand.id",
            "sale_mode",
            "comment.count",
            "comment.score",
            "theme_tag_id",
            "line_flag",
            "package_internal_order",
            "departure_city_ids"
        ],
        "excludes": []
    },
    "sort": [
        {
            "_script": {
                "script": {
                    "source": "boolean hasDepartureCity(org.elasticsearch.index.fielddata.ScriptDocValues.Strings values,String city){int low = 0;int high = values.size() - 1;while (low <= high){int mid=(low + high) >>> 1;String midVal = values.get(mid);int cmp = midVal.compareTo(city);if (cmp < 0){low = mid + 1;}else if (cmp > 0){high = mid - 1;}else{return true;}}return false;}long getScore(org.elasticsearch.index.fielddata.ScriptDocValues.Longs scores,long lowScore,long highScore){int low = 0;int high = scores.size() - 1;while (low <= high){int mid=(low + high) >>> 1;long midVal = scores.get(mid).longValue();if (midVal < lowScore){low = mid + 1;}else if (midVal > highScore){high = mid - 1;}else{return scores.get(mid).longValue()-lowScore;}}return 1L;}double total = _score;ArrayList sortedCities = params.sortedCities;HashMap cities = params.cities; String city = '';if (doc.containsKey('departure_city_ids')){for (i in sortedCities){        if(hasDepartureCity(doc['departure_city_ids'], i)){city = i; break;}}    if (city == '' && doc['departure_city_ids'].length > 0){        city =doc['departure_city_ids'][0];}}String  bk = params.tab + city;double hiveScore = 0;if (doc['sh_score_set'].size() == 0) return 0;if (city!='' && doc.containsKey('sh_score_set') && doc['sh_score_set'].value>0){long lowScore = (Long.valueOf(city,10).longValue()<<43);long highScore = lowScore + 100000;    hiveScore += getScore(doc['sh_score_set'],lowScore,highScore);}total += (hiveScore* (cities.containsKey(city) ? cities[city] : 1));return total;",
                    "lang": "painless",
                    "params": {
                        "sortedCities": [
                            "2",
                            "13",
                            "17",
                            "213",
                            "86",
                            "375"
                        ],
                        "cities": {
                            "2": 1000,
                            "13": 100,
                            "17": 50,
                            "86": 5,
                            "213": 10,
                            "375": 5
                        },
                        "departureCityPoi": 2,
                        "tab": "default_scores.",
                        "destCity": "0",
                        "boostOfTabVip": 0
                    }
                },
                "type": "number",
                "order": "desc"
            }
        },
        {
            "_score": {
                "order": "desc"
            }
        }
    ],
    "track_scores": true,
    "aggregations": {
        "totalCount": {
            "cardinality": {
                "field": "group",
                "precision_threshold": 100
            }
        },
        "festivals_tomorrow": {
            "terms": {
                "field": "festivals.22.keyword",
                "size": 200,
                "min_doc_count": 1,
                "shard_min_doc_count": 0,
                "show_term_doc_count_error": false,
                "order": [
                    {
                        "_count": "desc"
                    },
                    {
                        "_key": "asc"
                    }
                ]
            }
        },
        "festivals_today": {
            "terms": {
                "field": "festivals.21.keyword",
                "size": 200,
                "min_doc_count": 1,
                "shard_min_doc_count": 0,
                "show_term_doc_count_error": false,
                "order": [
                    {
                        "_count": "desc"
                    },
                    {
                        "_key": "asc"
                    }
                ]
            }
        }
    },
    "collapse": {
        "field": "group"
    }
}

Because the statement is complex, the query's response time is relatively long and its performance is poor. Do you have any suggestions for optimizing it?
