Real NetFlow monitoring for 10G traffic

Dear ESs :slight_smile:
we are in the phase of selecting technology for a custom NetFlow collector,
and we expect to handle almost 4 million records per minute. Does anybody have a use case like that? It has to be a local installation, not a cloud one, and it has to be 1 box only.

Can ES do it? What is the best setup for that? Are there any magic numbers for performance tuning?

What if I want to keep this data for 1 year?

4M / minute is roughly 66k / s, which is in the ballpark of what ES can handle. But it really depends on node configuration, the size and mapping of your documents, latency SLAs, etc.
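A quick back-of-the-envelope check of that ingest rate (just arithmetic on the stated 4M records/minute, not a benchmark):

```python
# Sustained ingest rate implied by 4 million NetFlow records per minute.
records_per_minute = 4_000_000
records_per_second = records_per_minute / 60

print(f"{records_per_second:,.0f} records/s")  # ~66,667 records/s
```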

I'd recommend setting up a realistic Rally test (github link) to help benchmark your setup.
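As a concrete starting point, a Rally race against an already-running cluster might look something like this (a sketch, not a recipe: the `http_logs` track and the host address are placeholders; for meaningful numbers you'd build a custom track whose documents mirror your NetFlow records):

```shell
# Install Rally, then benchmark an existing cluster.
# --pipeline=benchmark-only tells Rally not to provision Elasticsearch itself.
pip install esrally
esrally race --track=http_logs \
             --target-hosts=127.0.0.1:9200 \
             --pipeline=benchmark-only
```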

You may also be able to find some useful information in these perf tuning articles, although be warned they are starting to get a little outdated:

How retention affects your node depends largely on A) how much disk space you'll need and B) how often you query old data. If historical data mostly sits idle and is never queried, it won't have much impact on indexing speed. But if you're frequently querying old data, it'll have a larger impact and you'll need to size more appropriately (potentially with multiple nodes).

E.g. 4M docs / min == 2,102,400,000,000 docs / year, aka ~2.1 trillion docs. If we're conservative and say each doc only uses 48 bytes (I don't know how big NetFlow records are, but I'm guessing they are pretty small), that's ~100 TB of storage on a single node. Even assuming ES can do something crazy like 50% compression, it's still ~50 TB. So I think you'll need to either reduce your retention period, or start expanding to multiple nodes.
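Working the retention math through explicitly (same assumptions as above: 4M docs/minute, a guessed 48 bytes per stored doc, and a hypothetical 50% compression ratio):

```python
# Yearly document count and storage estimate for 1 year of NetFlow retention.
docs_per_minute = 4_000_000
docs_per_year = docs_per_minute * 60 * 24 * 365  # minutes -> hours -> days -> year

bytes_per_doc = 48              # guessed average; real size depends on your mapping
raw_tb = docs_per_year * bytes_per_doc / 1e12
compressed_tb = raw_tb * 0.5    # assuming an optimistic 50% compression

print(f"{docs_per_year:,} docs/year")   # 2,102,400,000,000
print(f"{raw_tb:.1f} TB raw, {compressed_tb:.1f} TB compressed")
```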


Wow, this was fast! Thanks @polyfractal

Yes, I agree with you that the old data would be a problem in this case, you are totally right.

You gave me a lot of helpful info; let me do my homework and then come back with more questions, if you don't mind :slight_smile:
Many thanks

Happy to help :slight_smile: Good luck!