Simple queries takes lots of time and uses 100% cpu

aimarjg · November 14, 2019, 9:25am

@DavidTurner: that's exactly what I'm saying - queries executed in devtools are done in seconds, while in discovery tab it takes ~10 times more.

Test done right now:

1. Query in discovery tab:
Request code: https://raw.githubusercontent.com/aimarpl/kb_queries/master/query_discovery
While executing, all CPUs at 100%:

Result after ~15 seconds (first query failed after two minutes, so I had to re-run it and then got response in 15 secs):
query_discovery_result

2. Query in devtools tab:
Request code: https://raw.githubusercontent.com/aimarpl/kb_queries/master/query_devtools

While executing & result (result provided almost instantly with almost 0 CPU load):

It's crazy...

aimarjg · November 14, 2019, 9:48am

I have even disabled highlighting in kibana settings to check on performance then, but... it has no impact at all (it was suggested in one of answers above).

Mark_Harwood · November 14, 2019, 9:58am

I see the discover vs console examples are querying different date ranges and timestamps - to further diagnose we need to get these requests as close to each other as possible.
The other thing to ensure is that caching is not taking effect. In console you can do this:

  GET myindex/_search?request_cache=false

Another difference between Discover and Console is that Discover will route user requests to the same choice of replica using the preference parameter (you can see this value using the Browser's network tools).
In your Console test, set ?preference=xxxx with the same session id used by the Discover tab.

aimarjg · November 14, 2019, 11:21am

Good catch, indeed timestamps were a little bit different...

I just did another test applying your suggestions into it:

Query from Discovery
Request code: https://raw.githubusercontent.com/aimarpl/kb_queries/master/discovery_query2

Execution:

Result:

It took 16 seconds

Query from dev tools:
Request code: https://raw.githubusercontent.com/aimarpl/kb_queries/master/devtools_query2

Result:

It took around 1-2 seconds

about replicas used - It's a single node with 5 shards and just 1 replica so I believe it does not matter, does it?

Mark_Harwood · November 14, 2019, 11:32am

Indexes are created daily (winlogbeat-YYYY-MM-dd) wi with 1 shard, 0 replicas.

It's a single node with 5 shards and just 1 replica

Are we talking about a different setup now?

aimarjg · November 14, 2019, 11:35am

No, yes... a little Few posts above I tried to reconfigure it from 1 shard and 0 replicas to 5 shards and 1 repl - I thought that maybe then it will run some search jobs in pararell increasing the overall performance, but... it had no impact.

Mark_Harwood · November 14, 2019, 12:09pm

If you only have a single node you'll never get any replicas (there's no added resilience or throughput benefits in allocating replica shards to the same node as the primary shard).

So I guess we can rule out any preference difference between Discover and Console.

It might be worth noting Kibana's description of timings. I wonder if there's added overhead in the unpacking of all that JSON in the 500 doc results (did we already establish performance was linear with the size setting for hits?)

aimarjg · November 14, 2019, 12:36pm

ah, then, basing on last query output it looks like
ES query takes only 54ms - super fast
and then preparing all json docs takes 16460ms ...correct?

I would test that with different sizes as you proposed, but doing that in devtools/console would not give me results I'm looking for (preparing docs).
If I do that from the discovery tab in kibana I can change size in advanced settings, but.... results are cached and each subsequent query for same data is faster. How do I disable cache over there?

Mark_Harwood · November 14, 2019, 12:56pm

If the suspicion is that it's all the JSON marshalling that takes the time the elasticsearch caches may be somewhat irrelevant. That said, there's an API you can use to clear the caches

aimarjg · November 14, 2019, 1:49pm

ok, good hint

first query, 500 results

query time: 170ms
request time: 27339ms

second query after clearing the cache with following:
POST /winlogbeat-2019.11.14/_cache/clear
POST /winlogbeat-2019.11.13/_cache/clear

Result:
# POST /winlogbeat-2019.11.14/_cache/clear
{
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
}
}

# POST /winlogbeat-2019.11.13/_cache/clear
{
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  }
}

Query time: 129ms
Request time: 634ms

so... even after clearing cache, second query is very fast (for the same data); how is that possible?!

Mark_Harwood · November 14, 2019, 2:04pm

There's OS-level filesystem caches to consider too but that does look like a suspiciously large difference.

Alfredo5520 · November 23, 2019, 7:49am

After analyzing the website type and traffic, we tweak the key settings in MySQL servers that helps to ensure optimal resource usage. This includes connection limits, buffer size and query cache size, and more.

aimarjg · November 30, 2019, 7:44pm

@Mark_Harwood, @DavidTurner - first of all, thank for you all your hints here finally some good news here
Got it fixed - I have disabled filterEditor:suggestValues in Kibana settings so KQL won't suggest values in discovery search and problem is gone.

Finally it all works as it should, question is - shouldn't this feature get reported as a buggy? As I read in other threads it looks up all indicies instead of those specified in time range picker which causes such CPU load.

Anyway - I hope that this workaround will help others having same issue with ELK 7.

Mark_Harwood · December 2, 2019, 9:38am

Glad to hear you identified the bottleneck. I added a note to the relevant Kibana issue.

system · December 30, 2019, 9:38am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Performance issues that make no sense Kibana	10	1244	July 28, 2017
Search Behavior/Performance Kibana	5	413	March 21, 2018
Our elastic search query performance is VERY low Elasticsearch	12	1586	May 11, 2017
Query duration too high for ~200G logs a day Elasticsearch	7	995	September 25, 2017
Kibana 4 Running ordinary search so so slow Kibana	16	6192	July 6, 2017

Simple queries takes lots of time and uses 100% cpu

Related topics