Unified Highlighter is too slow

aslamy1 · May 30, 2018, 8:06am

Hi!

When I use the unified highlighter, it takes about 20-22 seconds to retrieve the results, but when I switch to the fvh highlighter it takes about 2-3 seconds.
Why does Elasticsearch recommend the unified highlighter and make it the default highlighter? Its performance is very poor.

jimczi · May 30, 2018, 8:42am

Can you provide a small recreation with your mapping, the query that is slow with the unified highlighter, and a sample document? The unified highlighter uses term vectors when they are enabled on the field, so it is not expected to be 10x slower than the fvh.

aslamy1 · May 30, 2018, 8:58am

Hi!
I'm not allowed to share the company's code, but everything is exactly the same in my tests. The fields are indexed with "term_vector": "with_positions_offsets".
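
Since the mapping itself is not shared, here is a minimal sketch (not from the thread) of how one might confirm that every highlighted field really carries "term_vector": "with_positions_offsets". It assumes you have already fetched the mapping and extracted its `properties` object as a Python dict; the helper names and the sample mapping are hypothetical, and multi-fields declared under `fields` are not handled.

```python
def field_mapping(properties, dotted_name):
    """Walk a mapping 'properties' dict along a dotted field name."""
    node = {"properties": properties}
    for part in dotted_name.split("."):
        node = node.get("properties", {}).get(part)
        if node is None:
            return None  # field not found in the mapping
    return node


def fields_without_offsets(properties, highlighted_fields):
    """List highlighted fields that lack term vectors with positions and offsets."""
    missing = []
    for name in highlighted_fields:
        mapping = field_mapping(properties, name)
        if mapping is None or mapping.get("term_vector") != "with_positions_offsets":
            missing.append(name)
    return missing


# Hypothetical mapping: only "content" has term vectors enabled.
sample_properties = {
    "content": {"type": "text", "term_vector": "with_positions_offsets"},
    "document": {"properties": {"title": {"type": "text"}}},
}

print(fields_without_offsets(sample_properties, ["document.title", "content"]))
```

An empty list would mean both highlighters can read offsets from term vectors on every highlighted field.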

The only thing I change is "type": "unified" to "type": "fvh".
When I search for one or two tokens, both have almost the same performance, but the more tokens there are, the slower the unified highlighter gets.

5 tokens take about 20-22 seconds with the unified highlighter but 2-3 seconds with the fvh highlighter.

jimczi · May 30, 2018, 10:51am

Can you at least share the query that you used? How many fields and documents are highlighted, and what is the average size of the documents?

aslamy1 · May 30, 2018, 11:36am

The index has 15,683 documents and is 1.5 GB in size.
At index time we copy 5 fields into one field called "content".

{
    	"from": 0,
    	"size": 10,
    	"sort": [],
    	"highlight": {
    		"pre_tags": [
    			"<strong>"
    		],
    		"post_tags": [
    			"</strong>"
    		],
    		"fields": {
    			"document.title": {
    				"no_match_size": 512,
    				"number_of_fragments": 0,
    				"type": "unified"
    			},
    			"content": {
    				"fragment_size": 130,
    				"no_match_size": 256,
    				"number_of_fragments": 2,
    				"type": "unified"
    			}
    		}
    	},
    	"query": {
    		"bool": {
    			"must": [{
    				"multi_match": {
    					"query": "word",
    					"operator": "and",
    					"fields": [
    						"document.title^5",
    						"content"
    					]
    				}
    			}]
    		}
    	}
    }
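
To keep the comparison fair, the two requests should differ only in the highlighter type. The following is a rough sketch (not part of the original post) that derives an fvh variant from a trimmed-down copy of the request above; `with_highlighter` is a hypothetical helper, and sending the bodies to the cluster is left out. The `took` value in each search response can then be compared.

```python
import copy


def with_highlighter(body, highlighter_type):
    """Return a copy of the search body with every highlighted field set to the given type."""
    variant = copy.deepcopy(body)  # leave the original request untouched
    for options in variant["highlight"]["fields"].values():
        options["type"] = highlighter_type
    return variant


# Trimmed-down version of the request shown above.
unified_body = {
    "size": 10,
    "highlight": {
        "fields": {
            "document.title": {"number_of_fragments": 0, "type": "unified"},
            "content": {"fragment_size": 130, "number_of_fragments": 2, "type": "unified"},
        }
    },
    "query": {
        "multi_match": {
            "query": "word",
            "operator": "and",
            "fields": ["document.title^5", "content"],
        }
    },
}

fvh_body = with_highlighter(unified_body, "fvh")
```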

aslamy1 · May 30, 2018, 2:14pm

@jimczi A very important discovery:
When I remove "operator": "and" from the multi_match query, the response time drops from 20-22 seconds to 819 milliseconds.

jimczi · May 30, 2018, 2:21pm

Highlighting works only on the top documents (in your example the top 10, since you set size to 10), so changing the operator should not impact this phase. However, the documents returned in the top 10 are going to be different, so I suspect that you have a very big document (or several) that makes the highlighting slower when you use the query with the and operator. Can you check the size of the documents in both cases?
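
One way to run this size check is sketched below; it is an illustration, not code from the thread. It assumes the search response has already been parsed into a Python dict and that `_source` is returned with each hit. Because "content" is populated through copy_to, it does not appear in `_source`, so the sketch measures the whole serialized `_source` as a proxy for the amount of text the highlighter has to process. The function name and sample response are hypothetical.

```python
import json


def source_sizes(response):
    """Return (doc_id, size_in_bytes) for each top hit, largest first."""
    sizes = []
    for hit in response["hits"]["hits"]:
        # Serialized _source size is only a proxy for the highlighted text.
        raw = json.dumps(hit.get("_source", {}), ensure_ascii=False)
        sizes.append((hit["_id"], len(raw.encode("utf-8"))))
    return sorted(sizes, key=lambda pair: pair[1], reverse=True)


# Hypothetical response with one small and one larger document.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "1", "_source": {"document": {"title": "short"}}},
            {"_id": "2", "_source": {"document": {"title": "x" * 5000}}},
        ]
    }
}

for doc_id, size in source_sizes(sample_response):
    print(doc_id, size)
```

Running it on the responses for both variants of the query (with and without "operator": "and") would show whether one of them pulls much larger documents into the top 10.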

aslamy1 · May 30, 2018, 2:38pm

@jimczi I think you are right.
When "operator": "and" is set, the size of the documents found is 80 MB, and when it is not set the size is 2 MB.
Do you have any suggestions for solving this problem?

The fvh highlighter performs better on 80 MB of data than the unified highlighter.

jimczi · May 30, 2018, 2:49pm

We didn't test this extreme case, so I'll need to investigate a bit. I'll run some tests on my side and come back here in a bit. Thanks for reporting.

