Graylog vs Elasticsearch capacity planning


I'm new to Elasticsearch :).

I'm currently planning to deploy a Graylog instance to manage about 100 GB/day of logs and keep them for a year, so a total of 36-40 TB of logs (~5,000 msg/s).
The main use of the solution will be to index logs every day, with some dashboards and smart alerts on the last 24 hours.
The other use is for 2 or 3 people to search over multiple weeks of logs.
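As a sanity check, the numbers above are roughly consistent with each other; a quick back-of-envelope calculation (using only the figures stated here, with an assumed 1,000 GB = 1 TB conversion):

```python
# Back-of-envelope check of the volumes above (assumed: 100 GB/day, kept 365 days)
daily_gb = 100
retention_days = 365

raw_tb = daily_gb * retention_days / 1000
print(raw_tb)  # -> 36.5 (TB per year, before index overhead or replicas)

# Implied average message size at ~5,000 msg/s
avg_bytes = daily_gb * 1e9 / (5000 * 86400)
print(round(avg_bytes))  # -> 231 (bytes per message)
```

So ~36.5 TB of raw data per year at an average message size of a couple of hundred bytes, which matches the 36-40 TB estimate.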

So I don't really need high availability (the Graylog server has a cache to handle Elasticsearch unavailability); I only need to index and access a large amount of data.

Is it possible to run a single big Elasticsearch server with dual 18-core CPUs, 253 GB of RAM, and 50 TB of storage?
My goal is to simplify maintenance and limit the price of the solution.


Hello! It's possible, but ES has some "hidden" limitations. In my setup, one ES instance (31 GB heap) can handle 5-6 TB of indexed data (the heap is almost entirely eaten by terms_in_memory and other internal structures). So you would need to start 5-6 ES instances on one server to handle 30-40 TB of indexed data. You'd better test on real data and see whether you are affected or not. Another caveat is that you'll have a single point of failure (replicas on one server are useless), so it's better to use 3-4 mid-size servers for more stability.

OK, thanks for your response. Perhaps I can turn my big server into a hypervisor and run 5-6 VMs, each a standalone ES instance.

For HA it's better to use different h/w boxes or cloud VMs. If you have one server, it's easier to use Docker or start multiple ES instances on different ports (see link).
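The Docker route for several ES instances on one host could look roughly like this docker-compose sketch (hypothetical: the image tag, heap sizes, cluster name, and host paths are my assumptions, not from this thread; shown for an ES 6.x-era setup):

```yaml
# Hypothetical sketch only: two ES nodes of one cluster on a single host.
version: "2.2"
services:
  es1:
    image: docker.elastic.co/elasticsearch/elasticsearch:6.8.23
    environment:
      - cluster.name=graylog
      - node.name=es1
      - discovery.zen.ping.unicast.hosts=es1,es2
      - "ES_JAVA_OPTS=-Xms31g -Xmx31g"   # stay below ~32 GB so compressed oops still apply
    ports:
      - "9200:9200"                      # each node exposed on its own host port
    volumes:
      - /data/es1:/usr/share/elasticsearch/data
  es2:
    image: docker.elastic.co/elasticsearch/elasticsearch:6.8.23
    environment:
      - cluster.name=graylog
      - node.name=es2
      - discovery.zen.ping.unicast.hosts=es1,es2
      - "ES_JAVA_OPTS=-Xms31g -Xmx31g"
    ports:
      - "9201:9200"
    volumes:
      - /data/es2:/usr/share/elasticsearch/data
```

Note that this still keeps everything on one physical box, so as mentioned above it remains a single point of failure.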

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.