Slower query performance after upgrade 7.11 -> 7.14

I'm seeing slower performance for a 'match' search query after upgrading ES to version 7.14.
The query runs against a field that uses an ngram(1,2) analyzer.

The drop in search performance appears between ES 7.11 and 7.14 (the newer version is slower).

After some analysis, it turned out that the "match" timing in the profiler accounts for the difference.

Questions:

  • Why is the "match" timing higher for search queries in ES-7.14?
  • Were there changes between ES-7.11 and 7.14 that could affect this?
  • How can I get equal (or similar) times for this query in ES-7.14?

Here is an example to reproduce the behavior.

Each document has one property, "Name", containing the joined first and last name (e.g. "James, Baker").
The "Name" field is assigned the ngram analyzer. The data is stored in an index called "simple".

Mapping definition (mapping.yaml)

dynamic: false
properties:
  Name:
    type: text
    analyzer: onetwogram_analyzer

Index settings with the ngram analyzer (settings.yaml)

index:
  refresh_interval: -1
analysis:
  tokenizer:
    onetwogram_tokenizer:
      type: ngram
      min_gram: 1
      max_gram: 2
  analyzer:
    onetwogram_analyzer:
      tokenizer: onetwogram_tokenizer
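
To get a feel for what this analyzer emits, here is a rough pure-Python approximation of an ngram tokenizer with min_gram 1 and max_gram 2. It assumes the tokenizer keeps every character (spaces and commas included) and that no token filters are applied; the helper name `ngram_tokens` is made up for illustration and is not an Elasticsearch API.

```python
def ngram_tokens(text, min_gram=1, max_gram=2):
    """Approximate an ngram tokenizer: every substring of length min_gram..max_gram."""
    tokens = []
    for start in range(len(text)):
        for size in range(min_gram, max_gram + 1):
            # Skip grams that would run past the end of the text.
            if start + size <= len(text):
                tokens.append(text[start:start + size])
    return tokens


if __name__ == "__main__":
    tokens = ngram_tokens("Padron, Briana")
    print(len(tokens), tokens[:6])
```

The real token stream can be checked against the index with the `_analyze` API, which is the authoritative source if the two disagree.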

Document example

{
  "_index": "simple",
  "_type": "_doc",
  "_id": "7dxi9YABn-BLvzF_1E0I",
  "_score": 1,
  "_source": {
    "Name": "Padron, Briana"
  },
  "fields": {
    "Name": [
      "Padron, Briana"
    ]
  }
}

The search query is as follows

{
    "size": 100,
    "query": {
        "match": {
            "Name": {
                "query": "Forsythe, Shawanna",
                "minimum_should_match": "75%"
            }
        }
    }
}
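
As a rough sanity check on how many clauses this query involves, the sketch below estimates the clause count and the number that must match. It assumes that the match query turns each analyzed token into one optional clause (duplicates included) and that a percentage `minimum_should_match` is rounded down; `ngram_tokens` is a hypothetical stand-in for the analyzer, not an Elasticsearch API.

```python
import math


def ngram_tokens(text, min_gram=1, max_gram=2):
    """Hypothetical stand-in for the onetwogram analyzer (all characters kept)."""
    return [
        text[i:i + n]
        for i in range(len(text))
        for n in range(min_gram, max_gram + 1)
        if i + n <= len(text)
    ]


def required_clauses(query_text, percent):
    """Return (total optional clauses, clauses that must match) for a percentage."""
    clauses = len(ngram_tokens(query_text))
    return clauses, math.floor(clauses * percent / 100)


if __name__ == "__main__":
    print(required_clauses("Forsythe, Shawanna", 75))
```

Under those assumptions the example query has 35 optional clauses, of which 26 must match.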

Test scenario (5 rounds)

  • create index 'simple' and load 1 million documents (first run only)
  • refresh index
  • run 1000 search queries with randomized values
  • collect total time + hit count (summed over the round)
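
One round of this scenario could be sketched as below. This is not the original test harness: the URL, the `run_round` helper, and the fixed random seed are assumptions for illustration, and the name list is expected to come from the loaded data.

```python
import json
import random
import time
import urllib.request

# Assumed local single-node setup with security disabled.
ES_URL = "http://localhost:9200/simple/_search"


def build_query(name):
    """Build the match query from the post for one randomized name."""
    return {
        "size": 100,
        "query": {"match": {"Name": {"query": name, "minimum_should_match": "75%"}}},
    }


def http_search(body):
    """Send one search request and return the parsed JSON response."""
    request = urllib.request.Request(
        ES_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def run_round(names, search=http_search, queries=1000, seed=0):
    """Run one round and return (elapsed milliseconds, summed hit count)."""
    rng = random.Random(seed)  # fixed seed so both ES versions see the same queries
    total_hits = 0
    started = time.perf_counter()
    for _ in range(queries):
        response = search(build_query(rng.choice(names)))
        total_hits += response["hits"]["total"]["value"]
    elapsed_ms = (time.perf_counter() - started) * 1000
    return elapsed_ms, total_hits
```

Passing the search function as a parameter keeps the timing loop independent of the HTTP client, so the same loop can be pointed at either container.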

Setup:

  • PC: 10 cores, 32GB memory
  • Docker: 12GB memory
  • ES + Kibana containers: default settings (single-node; security disabled)

Results (per round of 1000 queries)

ES-7.11 (avg. ~25 sec per round, i.e. ~25 ms per single query)
Simple: Took: 25477 ms, # Results: 483340
Simple: Took: 24735 ms, # Results: 483340
Simple: Took: 24709 ms, # Results: 483340
Simple: Took: 24556 ms, # Results: 483340
Simple: Took: 25048 ms, # Results: 483340

ES-7.14 (avg. ~44 sec per round, i.e. ~44 ms per single query)
Simple: Took: 46166 ms, # Results: 483340
Simple: Took: 44123 ms, # Results: 483340
Simple: Took: 43894 ms, # Results: 483340
Simple: Took: 44087 ms, # Results: 483340
Simple: Took: 44085 ms, # Results: 483340

Single query response

{
  "took" : 46,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 55,
      "relation" : "eq"
    },
    "max_score" : 55.87664,
    "hits" : [ ... ]
  }
}

The single query response is the same in both versions

  • with exactly the same 55 hits (count, score, order)

Analysis

The profiler output for the search query shows that the 'match' timing is much higher in ES-7.14.
In ES-7.11 this number is quite low.

"profile" : {  
    ...
   "query" : [ ..
       
      "match" : 7180, # ES-7.11
      "match_count" : 56,
 vs 
      "match" : 27959515, # ES-7.14
      "match_count" : 245057,
   ]
}
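
To compare these two numbers across versions without reading the raw JSON by hand, a small helper can sum them over the whole profiled query tree. This sketch assumes the usual Profile API response layout (profile > shards > searches > query, each node with a "breakdown" and optional "children"); the function name `collect_breakdowns` is hypothetical.

```python
def collect_breakdowns(profile_response, keys=("match", "match_count")):
    """Sum selected breakdown entries over all shards and all query nodes."""
    totals = {key: 0 for key in keys}

    def walk(node):
        breakdown = node.get("breakdown", {})
        for key in keys:
            totals[key] += breakdown.get(key, 0)
        # Compound queries nest their sub-queries under "children".
        for child in node.get("children", []):
            walk(child)

    for shard in profile_response.get("profile", {}).get("shards", []):
        for search in shard.get("searches", []):
            for query in search.get("query", []):
                walk(query)
    return totals


if __name__ == "__main__":
    # Hand-built sample using the ES-7.11 numbers quoted above.
    sample = {"profile": {"shards": [{"searches": [{"query": [
        {"breakdown": {"match": 7180, "match_count": 56}}
    ]}]}]}}
    print(collect_breakdowns(sample))
```

Running the same query with "profile": true on both versions and feeding each response through this helper gives directly comparable totals.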

Profiler UI output (screenshot)

According to the profiler's description of the 'match' timing:

The time taken to execute a secondary, more precise scoring phase (used by phrase queries)

... and Profiler API > All parameters

Some queries, such as phrase queries, match documents using a "two-phase" process.
First, the document is "approximately" matched, and if it matches approximately,
it is checked a second time with a more rigorous (and expensive) process.
The second phase verification is what the match statistic measures.
...
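
The two-phase idea quoted above can be illustrated with a toy model. This is not how Lucene implements it: the documents, the choice of approximation (any shared term), and the function name `two_phase_search` are all assumptions made purely to show what a "match_count"-style counter measures.

```python
def two_phase_search(docs, query_terms, min_should_match):
    """Toy two-phase matcher returning (hits, number of phase-2 verifications)."""
    hits = []
    match_count = 0
    for doc_id, doc_terms in docs.items():
        # Phase 1 (approximation): cheap test, here "shares at least one term".
        if query_terms.isdisjoint(doc_terms):
            continue
        # Phase 2 (verification): the more expensive check that is being counted.
        match_count += 1
        if len(doc_terms & query_terms) >= min_should_match:
            hits.append(doc_id)
    return hits, match_count


if __name__ == "__main__":
    docs = {1: {"a", "b", "c"}, 2: {"a"}, 3: {"z"}}
    print(two_phase_search(docs, {"a", "b", "c", "d"}, 3))
```

In this toy, a looser approximation raises the verification count without changing the hits, which has the same shape as the numbers above (identical 55 hits, very different match_count). Whether that is what actually happens inside ES-7.14 is not something this sketch can show.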

What could be the cause of this behaviour?

Thanks in advance for any hints.
