Data tables are very slow, any tips on what could help?


#1

Hello,

Running Kibana and Elasticsearch 6.4.

I know this is very similar to "Kibana visualization is too slow", but I would like to ask what can be done and what kind of strategy one might use.

We have several tables in a single dashboard, and all of them show a count of something, often IP addresses or some other value that we want to count over a given timespan. As our data grows, these tables become very slow, even with smaller timespans.

For example, I have a dashboard with 3 tables, 1 metric, 2 time series and 1 horizontal bar. The metric, the horizontal bar and the two time series all load very fast (<1 second for small timeframes like 1-4 hours), but the 3 tables can take around 5 seconds to load.

If we go to extremes: we have a dashboard with 6 tables, and when checking the last 30 days (even though we only have data for about 10 days), we often hit the 30-second timeout. We can make that timeout larger, but that does not solve the main problem of these tables being very slow.

What can we do to speed them up? We have limited them by giving each a filter, so they only touch a small part of the logs, but they are still very slow.
We could do some pre-aggregation in an outside application and use another index, but that creates problems with filtering.

Any suggestion is welcome!


(Jon Budzenski) #2

Ah, that's always a tough one on the Kibana side; the data table queries are fairly rigid. Like you mentioned, pre-aggregation could help, or once you're on 6.5+, giving rollups a try (https://github.com/elastic/kibana/pull/21117). Are there any scripted fields? Those can be expensive too.
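For context, a rollup job pre-computes buckets on a schedule so dashboards can query a much smaller index instead of the raw logs. A minimal sketch of a 6.x rollup job configuration body (for `PUT _xpack/rollup/job/<job_id>`); the index pattern, field names, and schedule here are illustrative assumptions:

```python
# Hypothetical rollup job body; adjust index_pattern, fields and cron to taste.
rollup_job = {
    "index_pattern": "logs-*",       # source indices to roll up (assumption)
    "rollup_index": "logs_rollup",   # destination index for pre-aggregated docs
    "cron": "*/30 * * * * ?",        # how often the job processes new data
    "page_size": 1000,               # documents processed per batch
    "groups": {
        "date_histogram": {
            "field": "@timestamp",
            "interval": "1h"         # coarsest time resolution queryable later
        },
        "terms": {
            "fields": ["ip"]         # keep per-IP buckets available (assumption)
        }
    },
    "metrics": [
        {"field": "bytes", "metrics": ["sum"]}  # pre-computed sums per bucket
    ]
}
```

The trade-off is that you can only aggregate on the groups and metrics declared in the job, which is why it maps well onto fixed dashboard tables.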

Another approach is to take a look at the cluster health.
How much load is it under? What does the shard allocation look like? Searching over many, many shards is going to be expensive. On the other hand, if there's room for adding replicas, we can increase read throughput at the expense of writes.
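To check and adjust that: shard allocation can be listed with `GET _cat/shards?v`, and replicas can be added with an index settings update. A sketch of the settings body (the index name in the comment is an assumption):

```python
# Body for PUT /<index>/_settings (e.g. PUT /logs-2018.11.01/_settings)
# to request one replica per primary shard. Note: on a single-node cluster
# a replica stays unassigned until a second node joins, so this only helps
# once the cluster actually has more than one node.
replica_settings = {
    "index": {
        "number_of_replicas": 1  # 0 = no replicas, 1 = one extra copy per primary
    }
}
```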


#3

Hello, I understand this is a very hard question and very dependent on our setup.

We will "soon" upgrade to 6.5 so I'll look into that link then.

We have no scripted fields; any "scripted field"-like behavior we need is done in the ingest pipeline, in order to keep the system fast. Currently we have a very simple setup that we are testing out: just one node, with every index having 1 primary shard and 0 replicas. We don't have that much data; we get maybe 10 GB a day and currently store about 15 days, so total storage is about 150 GB.

We had fewer problems when we started out and only had about 3-4 days of data, so the total storage (or amount of logs to go over) seems to be a big part of the problem, understandably.

I have little control over, or knowledge of, how the shards are allocated and why we have 1 primary and 0 replicas, but I can try asking around.

I was thinking that even with more data to go over, limiting the search to 1 day would make it fast, but the data tables can still take 2-3 seconds to load with just the last hour.


(Christian Dahlqvist) #4

That does not sound like a lot of data. How many indices are you creating per day? What is the specification of the host Elasticsearch is running on? What kind of storage do you have?


(Sjaak) #5

What is the size of your terms bucket?

I had (have) the same problem. Elasticsearch performance probably has something to do with it, but apparently rendering the data inside Kibana is also quite resource-heavy on the client side, as it's single-threaded code. Take a look at your task manager while the table is loading and check the CPU load of your browser's process.

I worked around this issue in two ways:

  1. Limit the size of your terms bucket. I found that in my case a bucket size of several hundred is still OK-ish.
  2. Don't use terms aggregations on the timestamp field. If you need some kind of date bucketing, use a date histogram instead.
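A sketch of those two shapes in query-DSL terms (the field names are assumptions; `interval` is the 6.x parameter name, later versions use `calendar_interval`/`fixed_interval`):

```python
# 1. Terms aggregation with an explicit, bounded bucket size.
terms_agg = {
    "top_ips": {
        "terms": {
            "field": "ip",   # hypothetical keyword field
            "size": 100      # caps both the work done and what Kibana renders
        }
    }
}

# 2. For time-based bucketing, prefer date_histogram over terms on @timestamp:
# a terms agg on a high-cardinality timestamp creates one bucket per distinct
# value, while date_histogram produces a fixed number of interval buckets.
date_agg = {
    "per_hour": {
        "date_histogram": {
            "field": "@timestamp",
            "interval": "1h"
        }
    }
}
```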

#6

Not sure what you mean by what kind of storage. We have a 500 GB disk on a server that just stores the logs, and when it starts to get full, some space is automatically cleaned up (oldest index first). I do believe the server we use is kind of old and the drive might be oldish too, but since all other visualizations are fast, I thought it would be something specific to the data tables or the way we set them up.

We create 3 indices per day. One of them is for our pre-aggregation; this one is very small, about 10 MB. Then we have the main index that has "everything"; it's 10 GB. Then there is a minor extra one where we get some extra logs that aren't used; it's about 100 KB.


#7

I have been exploring different terms bucket sizes; we started with 50 and went down to 20, but there doesn't seem to be much difference between the two.

Almost all of the tables look something like this:

We have a filter for some log IDs that it should look for, anywhere from 1 log ID to 10ish; this should severely limit the amount of logs it has to go over.
Then we have a metric that is either a count of everything (but limited by the filter, right?) or a sum of some field. Then there is a terms bucket for whatever we want to count, normally an IP address field or something similar. Sometimes, but not always, we have another bucket split for geodata, to get the country for the IP address.
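In query terms, a table like that boils down to a search body roughly like the following; the field names (`log_id`, `ip`, `bytes`, `geoip.country_name`) and the time range are illustrative assumptions, not the actual mapping:

```python
# Sketch of what Kibana sends for such a data table: a filtered,
# aggregation-only search with a terms bucket, a sum metric, and an
# optional geodata sub-bucket.
search_body = {
    "size": 0,  # aggregation-only: don't return individual hits
    "query": {
        "bool": {
            "filter": [
                {"terms": {"log_id": [1, 2, 3]}},            # the log-ID filter
                {"range": {"@timestamp": {"gte": "now-1h"}}}  # dashboard timespan
            ]
        }
    },
    "aggs": {
        "per_ip": {
            "terms": {"field": "ip", "size": 20},  # the table's terms bucket
            "aggs": {
                "total_bytes": {"sum": {"field": "bytes"}},  # the sum metric
                "country": {   # optional geodata split
                    "terms": {"field": "geoip.country_name", "size": 5}
                }
            }
        }
    }
}
```

One thing this shape makes visible: the filter limits which documents are scanned, but the terms bucket on a high-cardinality field like an IP address still has to build and sort global buckets, which is where data tables tend to spend their time.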

I've tried removing the geodata to speed things up, and it doesn't seem to do much. I haven't done any extensive testing, but in quick tests, removing the geodata from three tables on the same dashboard and then reloading the page a few times didn't seem to affect it much.

All other visualizations load very fast.


(Sjaak) #8

How about you remove all terms and add them back one by one, running a search each time to check performance? Maybe there is one term that is somehow impacting performance.


#9

There are only 2 terms, and one of them is vital for this.

I have tried removing the other, but that didn't seem to affect it very much.


(system) #10

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.