Highlighting with fvh and fuzziness taking long time in Elasticsearch


(Nikesh) #1

Hi all,
I have indexed a single document with more than 150 metadata fields, each with the following mapping:

    "ACTIVE": {
        "type": "text",
        "term_vector": "with_positions_offsets",
        "fields": {
            "autocomplete_analyzed": {
                "type": "text",
                "analyzer": "autocomplete"
            },
            "keyword": {
                "type": "keyword",
                "ignore_above": 256
            }
        }
    }

and with these settings:

    "analysis": {
        "analyzer": {
            "autocomplete": {
                "filter": [
                    "lowercase"
                ],
                "tokenizer": "autocomplete"
            }
        },
        "tokenizer": {
            "autocomplete": {
                "min_gram": "3",
                "tokenize_on_chars": [
                    "whitespace",
                    "letter",
                    "digit"
                ],
                "type": "edge_ngram",
                "max_gram": "7"
            }
        }
    }

I set term_vector on the fields so that I can use the fast vector highlighter (fvh) in my query.
My query:

{
  "from": 0,
  "size": 24,
  "query": {
    "bool": {
      "should": [
        {
          "multi_match": {
            "query": "current",
            "type": "best_fields",
            "fields": []
          }
        },
        {
          "query_string": {
            "query": "*current*",
            "fields": []
          }
        },
        {
          "multi_match": {
            "query": "current",
            "fuzziness": "1",
            "fields": []
          }
        }
      ],
      "minimum_should_match": 1
    }
  },
  "highlight": {
    "type": "fvh",
    "fields": {
      "*": {}
    }
  }
}

My query needs fuzziness, wildcard and phrase matching.
Fuzziness and wildcard matching are enabled or disabled on the back end depending on the requirement, but for free-text search I have to enable both, together with highlighting.
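
To make the toggling concrete, here is a minimal sketch of how the back end might assemble the request body. The helper name build_query and its flags are hypothetical, and the field list is a placeholder (my real query passes its own list in "fields").

```python
import json


def build_query(text, fields, fuzzy=True, wildcard=True, highlight=True):
    """Assemble the search body shown above; clauses are added per flag."""
    # The plain best_fields match is always present.
    should = [
        {"multi_match": {"query": text, "type": "best_fields", "fields": fields}}
    ]
    if wildcard:
        # Leading and trailing wildcard, as in the query_string clause above.
        should.append({"query_string": {"query": f"*{text}*", "fields": fields}})
    if fuzzy:
        should.append(
            {"multi_match": {"query": text, "fuzziness": "1", "fields": fields}}
        )

    body = {
        "from": 0,
        "size": 24,
        "query": {"bool": {"should": should, "minimum_should_match": 1}},
    }
    if highlight:
        # Fast vector highlighter on every field, as in the original request.
        body["highlight"] = {"type": "fvh", "fields": {"*": {}}}
    return body


if __name__ == "__main__":
    # "ACTIVE" is one of the mapped fields; the full list is not shown here.
    print(json.dumps(build_query("current", ["ACTIVE"]), indent=2))
```

Free-text search corresponds to calling it with all three flags left on.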

  1. With fuzziness, the query takes more than 15000 ms with highlighting but only 800 ms without it.
  2. Without fuzziness, it takes around 1200 ms with highlighting and around 500 ms without it.
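
Put as arithmetic, the figures above work out as follows. This is only a back-of-the-envelope sketch: "more than 15000 ms" is treated as exactly 15000, so the first overhead is a lower bound, and the dictionary layout is just my own labelling.

```python
# Timings in milliseconds as reported above; keys are (fuzziness, highlighting).
timings_ms = {
    (True, True): 15000,   # reported as "more than 15000 ms", so a lower bound
    (True, False): 800,
    (False, True): 1200,   # "around 1200 ms"
    (False, False): 500,   # "around 500 ms"
}


def highlight_overhead_ms(fuzzy: bool) -> int:
    """Extra time spent when highlighting is on, for one fuzziness setting."""
    return timings_ms[(fuzzy, True)] - timings_ms[(fuzzy, False)]


with_fuzzy = highlight_overhead_ms(True)      # 15000 - 800 = 14200
without_fuzzy = highlight_overhead_ms(False)  # 1200 - 500 = 700

print(with_fuzzy, without_fuzzy, round(with_fuzzy / without_fuzzy, 1))
```

So highlighting adds at least 14200 ms when fuzziness is on, against roughly 700 ms when it is off, which is about 20 times more.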

So the slowness seems to come from fuzziness and highlighting working together.
How does highlighting with term vectors index the document? Why is the query running so slowly? Is it due to my query or to the way the data is indexed? I ask because I will be working with millions of documents.
What is the best way to tackle this latency problem?
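
One variant I could test, purely as a sketch and not a confirmed fix: name the highlighted fields instead of using "*", and pass a highlight_query that leaves out the fuzzy and wildcard clauses. highlight_query is a documented highlighter option. The helper name build_highlight and the field list are hypothetical, and whether this actually reduces the time is an assumption I have not measured.

```python
def build_highlight(text, fields):
    """Highlight section that only considers the exact (non-fuzzy) match."""
    return {
        "type": "fvh",
        # Highlight against this query instead of the full search query,
        # so the fuzzy and wildcard clauses are not expanded for highlighting.
        "highlight_query": {
            "multi_match": {"query": text, "type": "best_fields", "fields": fields}
        },
        # Explicit fields indexed with term_vector, instead of the "*" pattern.
        "fields": {name: {} for name in fields},
    }


if __name__ == "__main__":
    # "ACTIVE" is the field from the mapping above; other names would be added here.
    print(build_highlight("current", ["ACTIVE"]))
```

The trade-off is that terms matched only through fuzziness or wildcards would no longer be highlighted.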


Multi-Fields search using Span Queries with fuzziness in Elasticsearch
(Nikesh) #2

@Mark_Harwood Hi, this is the query I was referring to in the other post.


(Nikesh) #3

@elastic


(system) #4

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.