We are using a 3-node (master + data) cluster.
An AWS load balancer (hardware machine) sits in front of the 3-node cluster.
- RAM: 32 GB per node (50% to ES and the remaining 50% to the OS)
- Cores: 16 for each node
- Shards: 3
- Replicas: 1
- Index: 1
- Queries: Basic queries (no wildcards, no aggregations, etc.)
- Analyzers: Yes (4)
- Tokenizers: Yes (2)
- Filters: Yes (2)
- ngrams: Yes (both front and back ngrams are used)
- Total Data-Size: < 200 MB (very small, and it will always stay small)
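To make the shard layout above concrete, here is a minimal sketch of how these values would look as an index settings body, plus the number of shard copies they produce. The index name `my_index` and the helper `total_shard_copies` are hypothetical, used only for illustration; the shard, replica, and node counts come from the list above.

```python
import json

# Hypothetical index name, for illustration only.
INDEX_NAME = "my_index"
# Node count taken from the setup described above.
NODES = 3

# Settings body using the standard index settings
# number_of_shards and number_of_replicas, with the values listed above.
settings_body = {
    "settings": {
        "index": {
            "number_of_shards": 3,
            "number_of_replicas": 1,
        }
    }
}


def total_shard_copies(body):
    """Primaries plus replicas: shards * (1 + replicas)."""
    idx = body["settings"]["index"]
    return idx["number_of_shards"] * (1 + idx["number_of_replicas"])


copies = total_shard_copies(settings_body)
print(f"Settings for {INDEX_NAME}:")
print(json.dumps(settings_body, indent=2))
print(f"{copies} shard copies over {NODES} nodes = {copies / NODES:.0f} per node")
```

With 3 primaries and 1 replica each, every node currently holds 2 of the 6 shard copies.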
With this setup we can serve 2,500 QPS (queries per second). Now we want to reach 1 lakh (100,000) QPS.
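The size of the gap between the current throughput and the target of 1 lakh (100,000) QPS can be worked out as below. This is only arithmetic on the numbers in this post; it assumes nothing about how Elasticsearch actually scales, and the function name `scale_factor` is made up for the example.

```python
# Numbers taken from this post.
CURRENT_QPS = 2_500
TARGET_QPS = 100_000  # 1 lakh
CURRENT_NODES = 3


def scale_factor(current, target):
    """How many times the current throughput the target represents."""
    return target / current


factor = scale_factor(CURRENT_QPS, TARGET_QPS)
per_node_now = CURRENT_QPS / CURRENT_NODES

print(f"Required throughput increase: {factor:.0f}x")
print(f"Current load per node: about {per_node_now:.0f} QPS")
```

So the target is a 40x increase over what the cluster serves today.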
So my questions are:
- Do we need to increase the number of replicas to achieve our goal?
- Do we need to increase the number of shards from 3 to 12 to achieve our goal?
- Do we need to do both of the above (more replicas and more shards) to achieve our goal?
- Do we need to increase the number of nodes from 3 to 12 to achieve our goal?
- Do we need to drop the AWS LB and use the ES built-in load balancer (client node) instead? Our topology would then become a 12-node cluster with 2 client nodes, 7 master nodes, and the rest acting as data nodes. Would that achieve our goal?
Please don't say "try them all and figure it out yourself", because I cannot do that for certain reasons. If anyone has faced and/or solved this problem, please help me understand and solve this scaling problem.
Let's learn and grow together.