ES 7 is slower than ES 5 in a full-text search scenario

We are planning to upgrade from ES 5.6.7 to ES 7.x.

During testing, we migrated the index data from ES5 to ES7 and compared performance in a specific scenario. ES7's search time increased sharply as the number of search terms grew.
image

Time breakdown from the profile output:
ES5:

"score": 12630451,
"build_scorer_count": 63,
"match_count": 0,
"create_weight": 4035705,
"next_doc": 0,
"match": 0,
"create_weight_count": 1,
"next_doc_count": 0,
"score_count": 8332,
"build_scorer": 591863,
"advance": 36341745,
"advance_count": 54158

ES7:

"set_min_competitive_score_count": 0,
"match_count": 0,
"shallow_advance_count": 0,
"set_min_competitive_score": 0,
"next_doc": 0,
"match": 0,
"next_doc_count": 0,
"score_count": 9276,
"compute_max_score_count": 0,
"compute_max_score": 0,
"advance": 31980019,
"advance_count": 42472,
"score": 74502869,
"build_scorer_count": 56,
"create_weight": 2902138,
"shallow_advance": 0,
"create_weight_count": 1,
"build_scorer": 929773

ES5 version: 5.6.7, JDK: 1.8.0_131
ES7 version: 7.7.1, JDK: 14.0.1
Both clusters hold the same amount of data and use the same machine configuration.
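
To make the two profiles easier to compare, here is a small sketch that turns the totals above into per-call averages. It assumes the breakdown values are nanoseconds (as the profile API normally reports them) and uses only the numbers copied from this post; the helper names `per_call` and `compare` are just for illustration.

```python
# Profile breakdown values copied from the post (assumed to be nanoseconds).
es5 = {"score": 12630451, "score_count": 8332,
       "advance": 36341745, "advance_count": 54158}
es7 = {"score": 74502869, "score_count": 9276,
       "advance": 31980019, "advance_count": 42472}


def per_call(breakdown, phase):
    """Average time per call for one profiled phase."""
    return breakdown[phase] / breakdown[phase + "_count"]


def compare(phase):
    """Return (total-time ratio, per-call ratio) of ES7 over ES5 for a phase."""
    total_ratio = es7[phase] / es5[phase]
    call_ratio = per_call(es7, phase) / per_call(es5, phase)
    return total_ratio, call_ratio


for phase in ("score", "advance"):
    total_ratio, call_ratio = compare(phase)
    print(f"{phase}: ES7/ES5 total={total_ratio:.2f}x per-call={call_ratio:.2f}x")
```

Looking at per-call ratios rather than totals helps separate "more calls" from "each call is more expensive"; here the call counts are similar, so the difference is mostly in the cost of each `score` call.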

I want to know why the scoring phase takes several times longer on ES7 than on ES5. Is this a problem with the design of ES7 itself, or have I not tuned it properly?
I hope someone can help answer this.

Explaining your use-case, number of documents, your query and mapping would be helpful for others. What you have posted so far does not give enough information for further debugging.

As two use-cases apparently got faster, this might be a problem with the query or the mapping. Or it might be insufficient warmup, so you also need to explain your testing strategy.
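
As a rough illustration of what a warmup-aware test could look like, here is a minimal sketch. The `benchmark` helper and the `fake_query` stand-in are hypothetical; in a real test `run_query` would send the actual search request to each cluster.

```python
import statistics
import time


def benchmark(run_query, warmup=20, runs=100):
    """Call run_query() `warmup` times untimed, then return the median
    wall-clock latency in milliseconds over `runs` timed calls."""
    for _ in range(warmup):
        run_query()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        run_query()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# Stand-in workload so the sketch runs without a cluster; replace it with a
# function that sends the real search request.
def fake_query():
    sum(range(1000))


median_ms = benchmark(fake_query, warmup=5, runs=20)
print(f"median latency: {median_ms:.3f} ms")
```

Reporting a median over many runs after warmup makes the ES5 and ES7 numbers less sensitive to cold caches and one-off pauses.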


In all three test scenarios, the scoring phase is slower on ES7, and the gap widens as the number of search terms grows. Two scenarios are faster overall because my search includes some exact-match conditions, and ES7 handles exact matching faster. As the search terms increase, that gain no longer makes up for the extra scoring time, so queries with long search strings take longer overall.

The query looks like this demo:

{
    "from": 0,
    "size": 21,
    "timeout": "3s",
    "query": {
        "bool": {

            "should": [
                {
                    "function_score": {
                        "query": {
                            "multi_match": {
                                "query": "部编人教版一年级语文下册11课彩虹的教案",
                                "fields": [
                                    "date^1.0",
                                    "issue^1.0",
                                    "body^3.0",
                                    "volume^1.1"
                                ],
                                "type": "best_fields",
                                "operator": "OR",
                                "slop": 0,
                                "prefix_length": 0,
                                "max_expansions": 50,
                                "lenient": false,
                                "zero_terms_query": "NONE",
                                "boost": 1.0
                            }
                        },
                        "functions": [
                            {
                                "filter": {
                                    "bool": {
                                        "must": [
                                            {
                                                "term": {
                                                    "name": {
                                                        "value": "BMC_Proc_2013_Jul_22_7(Suppl_3)_S1",
                                                        "boost": 1.0
                                                    }
                                                }
                                            },
                                            {
                                                "term": {
                                                    "accession": {
                                                        "value": "$SB0100",
                                                        "boost": 1.0
                                                    }
                                                }
                                            }
                                        ],
                                        "adjust_pure_negative": true,
                                        "boost": 1.0
                                    }
                                },
                                "weight": 5.0
                            }
                        ],
                        "score_mode": "sum",
                        "boost_mode": "sum",
                        "max_boost": 3.4028235e38,
                        "boost": 1.0
                    }
                },
                {
                    "constant_score": {
                        "filter": {
                            "match_phrase": {
                                "body": {
                                    "query": "部编人教版一年级语文下册11课彩虹的教案",
                                    "slop": 0,
                                    "boost": 1.0
                                }
                            }
                        },
                        "boost": 20.0
                    }
                }
            ],
            "adjust_pure_negative": true,
            "minimum_should_match": "1",
            "boost": 1.0
        }
    },

    "sort": [
        {
            "_score": {
                "order": "desc"
            }
        },
        {
            "timestamp": {
                "order": "desc"
            }
        }
    ]
}

I found the problem. ES7 is slower than ES5 when executing function_score. If I run only the multi_match, ES7 performs better, but I don't know why function_score is slower on ES7. The profile shows that the function_score classes have changed: ES5 uses FiltersFunctionScoreQuery and ES7 uses FunctionScoreQuery. Does anyone know the reason for the slowdown?
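
For anyone who wants to reproduce the comparison, here is a sketch that builds the two request bodies: the multi_match alone, and the same multi_match wrapped in function_score, both with `"profile": true`. It is simplified from the demo query above (one filter function only), and the `profiled` helper is a hypothetical name; sending the bodies to a cluster is left out.

```python
import copy
import json

# Simplified from the demo query in the post; reused in both variants.
multi_match = {
    "multi_match": {
        "query": "部编人教版一年级语文下册11课彩虹的教案",
        "fields": ["date^1.0", "issue^1.0", "body^3.0", "volume^1.1"],
        "type": "best_fields",
        "operator": "OR",
    }
}


def profiled(query):
    """Wrap a query in a search body with profiling enabled."""
    return {"size": 21, "profile": True, "query": copy.deepcopy(query)}


# Variant 1: the multi_match on its own.
plain_body = profiled(multi_match)

# Variant 2: the same multi_match inside function_score with one weighted filter.
wrapped_body = profiled({
    "function_score": {
        "query": multi_match,
        "functions": [
            {"filter": {"term": {"accession": {"value": "$SB0100"}}},
             "weight": 5.0}
        ],
        "score_mode": "sum",
        "boost_mode": "sum",
    }
})

print(json.dumps(wrapped_body, ensure_ascii=False, indent=2))
```

Running both bodies against ES5 and ES7 and comparing the `score` entries in the profile output should show whether the extra time comes from the function_score wrapper rather than from the multi_match itself.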