Performance issues after upgrading ES and Nest to 5.4 from 1.7

We have just upgraded from 1.7 to 5.4.1 and since the upgrade, we are seeing quite a big performance hit on some pages. I've been trying to diagnose the issue but I am coming up short currently and would like some help to point me in the right direction.

The difference doesn't seem to be in the query execution itself. The "took" field in 5 is showing more or less the same small numbers as it was prior to the upgrade. However, the slowness happens during the call to the client somewhere. This is what I am referring to:

var client = GetClient();
var start = DateTime.Now;
var searchResponse = client.Search<TSearchDto>(c => c.SearchType(SearchType.QueryThenFetch).Size(0).Index(GetIndexName()).Query(q => query));
var timer = (DateTime.Now - start).Milliseconds;

I've even stripped the query back to a really basic term query on an int field:

Query<SearchDto>.Term(t => t.PortalId, 1);

What I see when I inspect that is 1ms on the took field of the search response but about 40ms on the timer. Doing the same thing on the code prior to the upgrade gives me 1ms for the elapsed time on the query and 2ms on the timer. That's a pretty huge difference.

I cannot tell where that extra time is coming from. Both old and new instances are running side by side on my local machine so there should not be any difference in the connection time and is is a count query so I wouldn't have though loading out the data would be an issue here either.

Any help at all would be greatly appreciated, even if it is just to point me in the right direction.

Are you timing the first request only?

The client is suppose to be shared across requests and there is a first request hit penalty while the client and JSON.NET build their caches.

We have an integration tests for this as well:

Although the test asserts <5ms the overhead is typically <0.01ms

No I always disregard the first run. I refresh the page a few times to get an average.

Ok we can rule that out then:

Keen to dive to the bottom of this

Can you share your SearchDTO type(s)?
Are you able to share your request and response?
What are your ConnectionSettings ?

With those we should be able to rule out serialization issues if any.

Which version of .NET are you using and on what platform?
Can you change your profiling to use Stopwatch (not expecting any major difference but for completeness sake)

Appreciate you looking into this Martjin.

Here is the mapping (the real one is more complicated but the issue is present with this simplified form):

{
   "item-201706191423": {
      "mappings": {
         "searchdto": {
            "dynamic": "false",
            "_all": {
               "enabled": true
            },
            "date_detection": true,
            "properties": {
               "AttachmentString": {
                  "type": "text",
                  "norms": false,
                  "fields": {
                     "ngram": {
                        "type": "text",
                        "norms": false,
                        "analyzer": "ngramanalyzer"
                     },
                     "sort": {
                        "type": "text",
                        "norms": false,
                        "analyzer": "lowerkeyword"
                     }
                  }
               },
               "Author": {
                  "type": "text",
                  "norms": false,
                  "fields": {
                     "ngram": {
                        "type": "text",
                        "norms": false,
                        "analyzer": "ngramanalyzer"
                     },
                     "sort": {
                        "type": "text",
                        "norms": false,
                        "analyzer": "lowerkeyword"
                     }
                  }
               },
               "CreatedBy": {
                  "type": "keyword",
                  "null_value": ""
               },
               "Description": {
                  "type": "text",
                  "norms": false,
                  "fields": {
                     "ngram": {
                        "type": "text",
                        "norms": false,
                        "analyzer": "ngramanalyzer"
                     },
                     "sort": {
                        "type": "text",
                        "norms": false,
                        "analyzer": "lowerkeyword"
                     }
                  }
               },
               "Id": {
                  "type": "keyword",
                  "null_value": ""
               },
               "Keywords": {
                  "type": "text",
                  "norms": false,
                  "fields": {
                     "ngram": {
                        "type": "text",
                        "norms": false,
                        "analyzer": "ngramanalyzer"
                     },
                     "sort": {
                        "type": "text",
                        "norms": false,
                        "analyzer": "lowerkeyword"
                     }
                  }
               },
               "OwnerId": {
                  "type": "keyword",
                  "null_value": ""
               },
              "PortalId": {
                 "type": "long"
               },
               "Tags": {
                  "type": "text",
                  "norms": false,
                  "fields": {
                     "ngram": {
                        "type": "text",
                        "norms": false,
                        "analyzer": "ngramanalyzer"
                     },
                     "sort": {
                        "type": "text",
                        "norms": false,
                        "analyzer": "lowerkeyword"
                     }
                  }
               },
               "Title": {
                  "type": "text",
                  "norms": false,
                  "fields": {
                     "ngram": {
                        "type": "text",
                        "norms": false,
                        "analyzer": "ngramanalyzer"
                     },
                     "sort": {
                        "type": "text",
                        "norms": false,
                        "analyzer": "lowerkeyword"
                     }
                  }
               }
            }
         }
      }
   }
}

Here are the connection settings:

var node = new Uri(elasticsearchUrl);
var setting = new ConnectionSettings(node);
setting.EnableDebugMode();
setting.DefaultFieldNameInferrer(p => p);

Here is the debug info for the request and response:

Valid NEST response built from a successful low level call on POST: /item-read/searchdto/_search?search_type=query_then_fetch
# Audit trail of this API call:
 - [1] HealthyResponse: Node: http://localhost:9201/ Took: 00:00:00.0379998
# Request:
{
  "size": 0,
  "query": {
    "term": {
      "PortalId": {
        "value": 0
      }
    }
  }
}
# Response:
{"took":0,"timed_out":false,"_shards":{"total":3,"successful":3,"failed":0},"hits":{"total":0,"max_score":0.0,"hits":[]}}

We are running on .Net 4.7 on a Windows 10 machine.

I hadn't noticed that before but the took on the audit trail is showing the full time (switched to Stopwatch and that showed 37ms for this request, just like the audit trail)

If you need any further info, just let me know.

Here is our index settings info too in case that helps:

"settings": {
         "index": {
            "number_of_shards": "3",
            "provided_name": "item-201706191423",
            "similarity": {
               "default": {
                  "type": "fixed_similarity"
               }
            },
            "creation_date": "1497878587430",
            "analysis": {
               "filter": {
                  "word_delimiter_filter": {
                     "type": "word_delimiter",
                     "split_on_case_change": "false"
                  },
                  "english_stemmer": {
                     "type": "stemmer",
                     "language": "english"
                  },
                  "unique_stem": {
                     "type": "unique"
                  },
                  "edge_ngram": {
                     "type": "edge_ngram",
                     "min_gram": "3",
                     "max_gram": "20"
                  },
                  "english_stop": {
                     "type": "stop"
                  }
               },
               "analyzer": {
                  "lowerkeyword": {
                     "filter": [
                        "lowercase"
                     ],
                     "type": "custom",
                     "tokenizer": "keyword"
                  },
                  "default": {
                     "filter": [
                        "lowercase",
                        "word_delimiter_filter",
                        "keyword_repeat",
                        "english_stop",
                        "english_stemmer",
                        "unique_stem",
                        "unique"
                     ],
                     "char_filter": [
                        "html_strip"
                     ],
                     "type": "custom",
                     "tokenizer": "standard"
                  },
                  "ngramanalyzer": {
                     "filter": [
                        "lowercase",
                        "word_delimiter_filter",
                        "edge_ngram",
                        "english_stop",
                        "english_stemmer",
                        "unique_stem",
                        "unique"
                     ],
                     "char_filter": [
                        "html_strip"
                     ],
                     "type": "custom",
                     "tokenizer": "standard"
                  }
               }
            },
            "number_of_replicas": "0",
            "uuid": "cBWdhbXURQacQRRtn5KFlA",
            "version": {
               "created": "5040199"
            }
         }
      }

We have a custom similarity provider plugged in but I have tried disabling it and it made no difference.

We have done some further investigations and while we haven't found a solution yet, we did find some interesting things that may help.

We rolled back our code to just before the upgrade (framework, libraries, etc are all the same) but left the existing ES5 cluster in place. When we ran the query against it with Nest version 1, the slowness is not happening. That narrows it down to the Nest request/response parsing code I believe.

However, we have spun up a new app using version 5 and pointed it at the same cluster and that is not experiencing the slowness. Kept the DTO the same and used the same query. Even added the same libraries to the project.

I will carry on investigating to see if I can get any further with it.

Edit: also found that upgrading the Nest 2 causes the problem.

I have managed to run a trace on the code to try to find the cause and it does look like a serialization issue. See below for the traces.

Nest v5

Nest v1

So I have finally managed to fix this. Basically, it was due to the way we reference the ElasticClient. We have an ElasticSearchIndexManager class that looks like this:

public class ElasticSearchIndexManager : IElasticSearchIndexManager
{
    private ElasticClient _client;

    public ElasticClient GetClient()
    {
        return _client ?? (_client = ElasticSearchClientFactory.GetClient());
    }
}

For some reason, registering the class as a singleton or making the _client field static completely fixed the issue for us. Previously, the client was getting created new each request. That simple change meant that the calls in the code to JSON.Net to serialise the SearchRequest objects took less than 1ms rather than 40ms. Turns out, while that wasn't an issue in version 1, it causes a big hit from version 2 up.

Glad you managed to get it sorted @Barry_Woods :thumbsup:

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.