I have a percolator use case in which we store only about 50-100 queries.
However, we may have many concurrent users, each sending anywhere from 200 to 3000 documents to be percolated at a given time. Initial testing shows response times increasing rapidly as these documents are sent to be percolated, and I'm having trouble figuring out how to scale the cluster for this particular use case.
Since we have so few queries, I would imagine we want 1 shard, and then scale out with many replicas across a number of instances to handle all of the incoming requests. Will adding more replicas and instances allow more concurrent users to send their documents to be percolated? Can anyone guide me on how to scale out this use case? Thanks.
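
To make the setup concrete, here is a rough sketch of what I have in mind. It only builds request bodies and sends nothing to a cluster. `number_of_shards` and `number_of_replicas` are the standard index settings; everything else is an assumption for illustration: the helper names, the `query` field name, the batch size of 500, and the use of a `percolate` query with a `documents` array, which is only available on newer Elasticsearch versions.

```python
import json


def index_settings(replicas):
    """Settings body for a single-shard percolator index with N replicas."""
    return {
        "settings": {
            "index": {
                "number_of_shards": 1,           # few queries, so one shard
                "number_of_replicas": replicas,  # one extra copy per additional node
            }
        }
    }


def chunk(docs, size):
    """Split one user's documents into batches of at most `size`."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [docs[i:i + size] for i in range(0, len(docs), size)]


def percolate_body(batch, field="query"):
    """Search body percolating a batch of documents in one request.

    Assumption: the percolator field is named `query` and the cluster
    version supports the `documents` array on the percolate query.
    """
    return {"query": {"percolate": {"field": field, "documents": batch}}}


if __name__ == "__main__":
    # Hypothetical upload: 2500 small documents from one user.
    docs = [{"message": "document %d" % i} for i in range(2500)]
    batches = chunk(docs, 500)
    print(json.dumps(index_settings(replicas=4), indent=2))
    print("requests needed for this upload:", len(batches))
```

The idea is that each replica is a full copy of the single shard, so any node holding a copy could serve a percolate request, and batching would cut the number of round trips per user. Whether this actually improves response times under load is exactly what I am unsure about.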