Hello, all,
We recently compared the query performance of two Elasticsearch clusters, one running 5.3.2 and the other running 7.4.2.
The index settings, mappings, and amount of data are exactly the same on both clusters.
For the same single query request, script sort is much slower on 7.4.2 than on 5.3.2.
The difference in response time is large, as shown in the figure below.
Could someone share what has changed in the underlying implementation of script sort between these versions, and how we can optimize the performance of this sort?
Note: the only difference between the two requests is the content of the script. Because of the version difference the script has some slight changes, but everything else is exactly the same.
E.g. doc['departure_city_ids'].values was changed to doc['departure_city_ids'], and the 7.4.2 script returns 0 early when doc['sh_score_set'] is empty.
The sort clause used on 7.4.2 is below. The "took" time reported by ES is about 165 ms.
"sort": [
  {
    "_script": {
      "script": {
        "source": "boolean hasDepartureCity(org.elasticsearch.index.fielddata.ScriptDocValues.Strings values,String city){int low = 0;int high = values.size() - 1;while (low <= high){int mid=(low + high) >>> 1;String midVal = values.get(mid);int cmp = midVal.compareTo(city);if (cmp < 0){low = mid + 1;}else if (cmp > 0){high = mid - 1;}else{return true;}}return false;}long getScore(org.elasticsearch.index.fielddata.ScriptDocValues.Longs scores,long lowScore,long highScore){int low = 0;int high = scores.size() - 1;while (low <= high){int mid=(low + high) >>> 1;long midVal = scores.get(mid).longValue();if (midVal < lowScore){low = mid + 1;}else if (midVal > highScore){high = mid - 1;}else{return scores.get(mid).longValue()-lowScore;}}return 1L;}double total = _score;ArrayList sortedCities = params.sortedCities;HashMap cities = params.cities; String city = '';if (doc.containsKey('departure_city_ids')){for (i in sortedCities){ if(hasDepartureCity(doc['departure_city_ids'], i)){city = i; break;}} if (city == '' && doc['departure_city_ids'].length > 0){ city =doc['departure_city_ids'][0];}}String bk = params.tab + city;double hiveScore = 0;if (doc['sh_score_set'].size() == 0) return 0;if (city!='' && doc.containsKey('sh_score_set') && doc['sh_score_set'].value>0){long lowScore = (Long.valueOf(city,10).longValue()<<43);long highScore = lowScore + 100000; hiveScore += getScore(doc['sh_score_set'],lowScore,highScore);}total += (hiveScore* (cities.containsKey(city) ? cities[city] : 1));return total;",
        "lang": "painless",
        "params": {
          "sortedCities": [
            "2",
            "13",
            "17",
            "213",
            "86",
            "375"
          ],
          "cities": {
            "2": 1000,
            "13": 100,
            "17": 50,
            "86": 5,
            "213": 10,
            "375": 5
          },
          "departureCityPoi": 2,
          "tab": "default_scores.",
          "destCity": "0",
          "boostOfTabVip": 0
        }
      },
      "type": "number",
      "order": "desc"
    }
  },
  {
    "_score": {
      "order": "desc"
    }
  }
],
The sort clause used on 5.3.2 is below. The "took" time reported by ES is about 29 ms.
"sort": [
  {
    "_script": {
      "script": {
        "inline": "boolean hasDepartureCity(org.elasticsearch.index.fielddata.ScriptDocValues.Strings values,String city){int low = 0;int high = values.size() - 1;while (low <= high){int mid=(low + high) >>> 1;String midVal = values.get(mid);int cmp = midVal.compareTo(city);if (cmp < 0){low = mid + 1;}else if (cmp > 0){high = mid - 1;}else{return true;}}return false;}long getScore(org.elasticsearch.index.fielddata.ScriptDocValues.Longs scores,long lowScore,long highScore){int low = 0;int high = scores.size() - 1;while (low <= high){int mid=(low + high) >>> 1;long midVal = scores.get(mid).longValue();if (midVal < lowScore){low = mid + 1;}else if (midVal > highScore){high = mid - 1;}else{return scores.get(mid).longValue()-lowScore;}}return 1L;}double total = _score;ArrayList sortedCities = params.sortedCities;HashMap cities = params.cities; String city = '';if (doc.containsKey('departure_city_ids')){for (i in sortedCities){ if(hasDepartureCity(doc['departure_city_ids'].values, i)){city = i; break;}} if (city == '' && doc['departure_city_ids'].values.length > 0){ city =doc['departure_city_ids'].values[0];}}String bk = params.tab + city;double hiveScore = 0;if (city!='' && doc.containsKey('sh_score_set') && doc['sh_score_set'].value>0){long lowScore = (Long.valueOf(city,10).longValue()<<43);long highScore = lowScore + 100000; hiveScore += getScore(doc['sh_score_set'].values,lowScore,highScore);}total += (hiveScore* (cities.containsKey(city) ? cities[city] : 1));return total;",
        "lang": "painless",
        "params": {
          "sortedCities": [
            "2",
            "13",
            "17",
            "213",
            "86",
            "375"
          ],
          "cities": {
            "2": 1000,
            "13": 100,
            "17": 50,
            "86": 5,
            "213": 10,
            "375": 5
          },
          "departureCityPoi": 2,
          "tab": "default_scores.",
          "destCity": "0",
          "boostOfTabVip": 0
        }
      },
      "type": "number",
      "order": "desc"
    }
  },
  {
    "_score": {
      "order": "desc"
    }
  }
],
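Since the script is hard to read as a single line, here is a rough plain-Java sketch of the two binary-search helpers that both versions of the script define. It is only an illustration, not the code that runs in ES: the class name ScriptSortSketch is made up, plain List<String> and List<Long> stand in for ScriptDocValues.Strings and ScriptDocValues.Longs, and it assumes the values arrive sorted in ascending order, which the binary search relies on.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the helpers embedded in the Painless script.
public class ScriptSortSketch {

    // Binary search for an exact city id among the document's sorted values.
    static boolean hasDepartureCity(List<String> values, String city) {
        int low = 0;
        int high = values.size() - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            int cmp = values.get(mid).compareTo(city);
            if (cmp < 0) {
                low = mid + 1;
            } else if (cmp > 0) {
                high = mid - 1;
            } else {
                return true;
            }
        }
        return false;
    }

    // Binary search for any value inside [lowScore, highScore];
    // returns its offset from lowScore, or 1 when no value is in range.
    static long getScore(List<Long> scores, long lowScore, long highScore) {
        int low = 0;
        int high = scores.size() - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            long midVal = scores.get(mid);
            if (midVal < lowScore) {
                low = mid + 1;
            } else if (midVal > highScore) {
                high = mid - 1;
            } else {
                return midVal - lowScore;
            }
        }
        return 1L;
    }

    public static void main(String[] args) {
        // Made-up sample data: city ids in lexicographic order.
        List<String> cityIds = Arrays.asList("13", "17", "2");
        System.out.println(hasDepartureCity(cityIds, "2"));   // true
        System.out.println(hasDepartureCity(cityIds, "86"));  // false

        // Each score is encoded as (cityId << 43) + score, as in the script.
        long base = 2L << 43;
        List<Long> scores = Arrays.asList(base + 42L, (13L << 43) + 7L);
        System.out.println(getScore(scores, base, base + 100000)); // 42
    }
}
```

In the real script, lowScore is the chosen city id shifted left by 43 bits and highScore is lowScore + 100000, so getScore extracts the score stored for that city from sh_score_set.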
I have also posted screenshots of the process I used for testing; the pictures are as follows.