Recommended Setup and Settings for Very Large Data


I'd like your help with the recommended cluster setup and settings for the following data:
To index: 32 billion records (around 7 TB)
Node specs: 8 CPUs x 64 RAM x 1.7 TB disk
Query complexity: 5 levels "has_child"
Queries include terms, range(number, date) and aggregations (date histogram, terms)
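
To make the query shape concrete, here is a rough sketch of what a request with five levels of "has_child" plus the listed filters and aggregations might look like. The relation names (level1 to level5) and the field names (status, created) are hypothetical placeholders, not the real mapping, and "calendar_interval" assumes Elasticsearch 7 or later.

```python
import json


def nested_has_child(child_types, leaf_query):
    """Wrap leaf_query in one has_child clause per child type, outermost first."""
    query = leaf_query
    for child_type in reversed(child_types):
        query = {"has_child": {"type": child_type, "query": query}}
    return query


# Hypothetical join relation names, one per has_child level.
levels = ["level1", "level2", "level3", "level4", "level5"]

# Hypothetical leaf filter combining a terms query and a date range query.
leaf = {
    "bool": {
        "filter": [
            {"terms": {"status": ["a", "b"]}},
            {"range": {"created": {"gte": "2020-01-01", "lt": "2021-01-01"}}},
        ]
    }
}

body = {
    "size": 0,
    "query": nested_has_child(levels, leaf),
    # Aggregations run over the top-level documents that match the query.
    "aggs": {
        "per_month": {
            "date_histogram": {"field": "created", "calendar_interval": "month"},
            "aggs": {"by_status": {"terms": {"field": "status"}}},
        }
    },
}

print(json.dumps(body, indent=2))
```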

The ask:
- No. of nodes
- No. of primary shards
- No. of replicas
- other search optimization config
- etc

Thanks in advance!

You will need to benchmark to find out. I doubt anyone will be able to tell you with any accuracy. Using such deeply nested documents sounds potentially problematic or inefficient. How have you determined that this is the optimal data model for the use case?
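
As a starting point for that benchmark, the figures in the question can at least be turned into rough lower bounds. The sketch below is back-of-the-envelope arithmetic, not a recommendation. It assumes the on-disk index size equals the quoted 7 TB (only indexing a sample will confirm this), one replica, an 85% usable-disk fraction (the default low disk watermark), a target of about 50 GB per shard (common guidance, not a hard rule), and the Lucene limit of 2,147,483,519 documents per shard. The function name and parameters are made up for illustration.

```python
import math


def sizing_lower_bounds(
    primary_gb,
    docs,
    disk_per_node_gb,
    replicas=1,                          # assumption: one replica per primary
    usable_fraction=0.85,                # assumption: stay under the low disk watermark
    target_shard_gb=50,                  # assumption: common shard-size guidance
    max_docs_per_shard=2_147_483_519,    # Lucene per-shard document limit
):
    """Return rough lower bounds for node and primary shard counts."""
    total_gb = primary_gb * (1 + replicas)
    usable_per_node_gb = disk_per_node_gb * usable_fraction
    return {
        "total_gb": total_gb,
        "min_nodes_by_disk": math.ceil(total_gb / usable_per_node_gb),
        "primary_shards_by_size": math.ceil(primary_gb / target_shard_gb),
        "min_primary_shards_by_doc_count": math.ceil(docs / max_docs_per_shard),
    }


# Figures from the question: 32 billion records, about 7 TB, 1.7 TB disk per node.
estimate = sizing_lower_bounds(
    primary_gb=7_000,
    docs=32_000_000_000,
    disk_per_node_gb=1_700,
)
print(estimate)
```

These numbers only bound disk capacity and shard limits. They say nothing about query latency, heap pressure from the joins, or indexing throughput, which is what the benchmark has to measure.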
