I'm designing a system that will hold ~2M documents, and 95% of the queries will target only 40-50k of them.
Documents in the system have one of two states, stored in a bool field: ~50k are in the state active and the remaining ~1.95 million are inactive.
So my question is: does it make sense to separate these documents? If so, how can I separate them? Or do I not need to worry about this at all? Maybe I could use shards and a hash function that chooses the bucket depending on the active/inactive state?
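To make the shard idea concrete, this is roughly what I picture. It is only a sketch in Python that builds request parameters as plain dicts, with no client library involved. The index name `documents`, the fields `active` and `title`, and all helper names are placeholders of mine; `routing` is the request parameter for custom routing, as far as I understand it.

```python
from typing import Any, Dict

INDEX_NAME = "documents"  # hypothetical index name


def routing_key(active: bool) -> str:
    """Map the boolean state to a routing value (my own naming scheme)."""
    return "active" if active else "inactive"


def index_request(doc_id: str, doc: Dict[str, Any]) -> Dict[str, Any]:
    """Parameters for indexing one document, routed by its state."""
    return {
        "index": INDEX_NAME,
        "id": doc_id,
        "routing": routing_key(bool(doc["active"])),
        "document": doc,
    }


def search_request(active: bool, text: str) -> Dict[str, Any]:
    """Parameters for a search limited to one state.

    The term filter is kept in addition to the routing value, because
    both routing values could hash to the same shard.
    """
    return {
        "index": INDEX_NAME,
        "routing": routing_key(active),
        "query": {
            "bool": {
                "must": [{"match": {"title": text}}],  # "title" is a placeholder field
                "filter": [{"term": {"active": active}}],
            }
        },
    }
```

With only two routing values I am not sure this buys anything over a plain filter on the bool field, which is part of what I am asking.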
I could also introduce two indexes, one that holds the active documents and one for the inactive ones, so that most queries would do their lookup in the small one. This solution, however, will require more maintenance and could cause more headaches in the long run.
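The maintenance I am worried about with two indexes looks roughly like this: every state change means moving the document from one index to the other. The sketch below is plain Python that only describes the steps as dicts. The index names `docs-active` and `docs-inactive`, the `op` labels, and the helper names are all made up for illustration and are not an ES API.

```python
from typing import Any, Dict, List

ACTIVE_INDEX = "docs-active"      # hypothetical index name
INACTIVE_INDEX = "docs-inactive"  # hypothetical index name


def index_for(active: bool) -> str:
    """Pick the index that should hold a document in the given state."""
    return ACTIVE_INDEX if active else INACTIVE_INDEX


def state_change_actions(
    doc_id: str, doc: Dict[str, Any], now_active: bool
) -> List[Dict[str, Any]]:
    """Describe the two steps needed when a document flips state:
    write it to the index for the new state, then delete it from the old one.
    """
    updated = {**doc, "active": now_active}
    return [
        {"op": "index", "index": index_for(now_active), "id": doc_id, "document": updated},
        {"op": "delete", "index": index_for(not now_active), "id": doc_id},
    ]
```

For example, `state_change_actions("42", {"active": True, "title": "t"}, False)` describes writing document 42 to `docs-inactive` and deleting it from `docs-active`. If the second step fails, the document would exist in both indexes, which is the kind of headache I mean.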
I'm not fully aware of the capabilities ES provides, and I would love to hear from someone with more experience.