Volume Sizing Example

Each node has a certain amount of resources available, e.g. CPU, heap space and disk I/O capacity. Indexing, querying as well as just storing data use resources, and therefore compete for resources with eachother.

Indexing is a very disk I/O intensive process but can also use a lot of CPU and need a good amount of heap space as well. Querying also uses the same resources and the amount of resources required depend on the required query latency as well as the amount of data queried. Just storing data on a node typically just consume heap.

This is based on empirical observations and is what a lot of users use for hot nodes. Holding relatively little data means querying and storage require less resources, which leaves more for indexing. It is all about finding a good balance between indexing, storage and querying.

If you have 3 nodes instead of 15, each node will need to index 5 times as much. They will also store 5 times as much data and handle querying for 5 times the data volume. At the ingest volumes and retention period you mentioned I suspect you will run into resource limitations at a very early stage.

You can also have a look at this webinar, which talks about heap usage and how to optimize it.