we recently upgraded our ES logging cluster from 2.X to 5.4 and implemented a distributed (3 x client, 3 x master and 6 x data nodes on separate machines) architecture.
Before we ran everything on 6 x all-purpose nodes. In terms of hardware they are the same though. We imported data to the new cluster using ES snapshots, so index settings, like
number of shards are the same as well.
We now see worse query duration times, especially when querying large time periods via Kibana.
default Kibana query, not searching for anything, and querying the last 60 days of an index pattern
NEW CLUSTER: ------------ Query Duration 9778ms Request Duration 9992ms Hits 760170271 OLD CLUSTER: ------------- Query Duration 7730ms Request Duration 10002ms Hits 755127976
Comparing the node load while the query runs, we see the Process CPU of the new cluster rising to 100% on all data nodes, while the old cluster merely reaches ~85%. (remember, the hardware is the same)
I'm happy to provide other metrics for you guys.
Thanks for your help in advance