Hardware for searching and visualizing 3 billion documents?

Hi,

I know this might be a bit of a vague question, but hopefully I can get some insight into the hardware I'll need.

I'm collecting netflow data and using Kibana to build visualizations. My dashboard has about 20 visualizations: bar charts, pie charts and tables. Most of them are sums of total traffic per port/application/IP, sums of total data usage per day/week/month, and sums of data usage per IP per day/week/month.
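To make that concrete, a single one of those visualizations roughly boils down to a terms aggregation with a sum sub-aggregation. Here's a sketch using Python's requests library; the field names (dest_port, total_bytes), the index pattern, endpoint and credentials are placeholders, not my real mapping:

```python
import requests

# Rough equivalent of one pie chart: total bytes per destination port over
# the last 30 days. Field names are placeholders for the actual netflow mapping.
query = {
    "size": 0,  # only the aggregation matters, not the individual hits
    "query": {"range": {"@timestamp": {"gte": "now-30d"}}},
    "aggs": {
        "per_port": {
            "terms": {"field": "dest_port", "size": 20},
            "aggs": {"bytes": {"sum": {"field": "total_bytes"}}},
        }
    },
}

resp = requests.post(
    "http://localhost:9200/netflow-*/_search",  # placeholder endpoint
    json=query,
    auth=("elastic", "changeme"),               # placeholder credentials
)
print(resp.json()["aggregations"]["per_port"]["buckets"])
```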

This is the time a request takes for a single pie graph summing total.bytes per ip/port.

Hits 13885344
Query time 3705ms
Request time 4895ms

This is the same visualization but inside my dashboard.

Hits 13885180
Query time 14151ms
Request time 21212ms

So far I've noticed performance scales fairly linearly. E.g. 60 days of data takes about twice as long to load as 30 days of data.

Based on my sample data I think I could end up with ~3 billion documents. What kind of cluster would I need to be able to search through that data with somewhat acceptable performance?
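As a rough back-of-the-envelope check, assuming the scaling really does stay linear and ignoring caching, filtering and coordination overhead:

```python
# Back-of-the-envelope: if request time scales linearly with hits, how long
# would the single pie chart above take over ~3 billion documents?
current_hits = 13_885_344
current_request_s = 4.9            # single visualization, seconds
target_hits = 3_000_000_000

scale = target_hits / current_hits          # ~216x more data
naive_time_s = current_request_s * scale    # ~1060 s, i.e. ~18 minutes

# To land in a 1-2 minute budget you'd need roughly this much extra
# parallelism (more shards spread over more nodes):
target_s = 120
parallelism_needed = naive_time_s / target_s   # ~9x

print(f"scale: {scale:.0f}x, naive time: {naive_time_s / 60:.1f} min, "
      f"parallelism needed: ~{parallelism_needed:.0f}x")
```

If that naive extrapolation holds, a single instance like mine would be looking at something like 15-20 minutes per visualization, which is why I'm wondering how far parallelism can take me.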

Right now I'm running a single 15GB instance on Elastic Cloud, with weekly indices and 1 shard per index. I'd say performance is reasonable at the moment (around 25 seconds to fully load a dashboard). But if I need to search ~200 times the amount of data I have right now, what kind of cluster would I be looking at?

My understanding is that more shards spread over multiple instances will increase performance because searches run in parallel. How about more shards per index on the same instance? Will that increase performance as well? And how linearly does performance scale when you add an instance to a cluster (e.g. will doubling the instances give ~2x the performance)?
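To make that question concrete, this is the kind of change I have in mind: creating each weekly index with more primary shards so a single search can fan out across nodes. A sketch only; the index name and shard/replica counts are made-up examples:

```python
import requests

# Create a weekly index with several primary shards so one search can be
# worked on by several nodes in parallel. Numbers and names are examples only.
settings = {
    "settings": {
        "number_of_shards": 4,     # e.g. roughly one primary per data node
        "number_of_replicas": 1,
    }
}

resp = requests.put(
    "http://localhost:9200/netflow-2017.30",   # example weekly index name
    json=settings,
    auth=("elastic", "changeme"),              # placeholder credentials
)
print(resp.json())
```

My naive assumption is roughly one primary shard per data node, but I don't know how well that reasoning holds when the extra shards live on the same instance, hence the question.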

I'm not sure how to check server utilization on Elastic Cloud, but before this I had everything running on an AWS instance (16GB, 4 cores), and as far as I could tell CPU utilization only spiked for a couple of seconds during a search.

The above example is a worst-case scenario where all the data would be displayed. A more realistic use case is one where the same number of documents is searched, but only 1/100th of the data is needed for aggregations etc. (filtered based on the location of my devices).
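Concretely, that realistic case would just add a filter to the aggregation sketch above, along these lines (location_id is a made-up field name standing in for however the devices' locations end up being tagged):

```python
# Same sum-per-port aggregation, but restricted to one site first, so only
# ~1/100th of the matching data feeds the aggregation. "location_id" and
# "site-042" are placeholders.
filtered_query = {
    "size": 0,
    "query": {
        "bool": {
            "filter": [
                {"range": {"@timestamp": {"gte": "now-30d"}}},
                {"term": {"location_id": "site-042"}},
            ]
        }
    },
    "aggs": {
        "per_port": {
            "terms": {"field": "dest_port", "size": 20},
            "aggs": {"bytes": {"sum": {"field": "total_bytes"}}},
        }
    },
}
```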

TL;DR: What kind of cluster would I be looking at if I need to search and visualize ~3 billion documents in 1-2 minutes?

What is your current shard size? When you say you expect 200 times the amount of data, how much of that is increased daily ingest versus increased retention period?

The retention period will not change (90 days). I'm currently using weekly indices and each (weekly) shard is ~150MB. I went with weekly indices to stay within the recommended 20-50GB limit for an index, based on the total amount of data I eventually plan to have.
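For what it's worth, this is the back-of-the-envelope sizing behind that choice (ignoring replicas and assuming compression per document stays roughly the same):

```python
# Rough sizing if ingest grows ~200x while retention stays at 90 days.
current_weekly_shard_gb = 0.150            # ~150 MB per weekly shard today
growth_factor = 200

future_weekly_index_gb = current_weekly_shard_gb * growth_factor   # ~30 GB
weeks_retained = 90 / 7                                            # ~13 weeks
total_primary_gb = future_weekly_index_gb * weeks_retained         # ~385 GB

print(f"~{future_weekly_index_gb:.0f} GB per weekly index, "
      f"~{total_primary_gb:.0f} GB of primary data on disk")
```

So each weekly index should land around 30GB, still inside the 20-50GB range, with roughly 385GB of primary data on disk at full retention.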

By the way, these are the Elastic Cloud stats. Even when I open up a bunch of dashboards at the same time, and even though they generate visualization timeouts, it doesn't appear Elasticsearch is running out of resources.

Edit: under Advanced it does show 100% CPU utilization. Which one is correct?
