How many indexes is too many indexes?

I recently decided that it might be a good idea to change the structure of our Elasticsearch indexes for performance. Right now we have a single index with two types, while our corresponding graph has a database for each “project”. Though projects aren't added that often, it's possible that an unbounded number of them could be added in the future.
Logically it makes sense that each index in ES should be a “project”, but that seems to run counter to the documentation: “For that reason, a single large index is more efficient than several small indices: the fixed cost of the Lucene index is better amortized across many documents.” I'm confused about how to optimize our performance. Any suggestions?
Thanks!
Edit: I'd like to add that we don't know how large an index will become when we create it (so balancing the shards is also an issue).

It's really shards that are the (potential) problem. If you have more than roughly 400 per node (that's a pretty vague and rough number based on anecdata), you will be wasting heaps of heap on maintaining them.
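
To put rough numbers on that, here is a back-of-the-envelope sketch in Python for an index-per-project layout. Everything in it is an assumption for illustration: the project counts, 5 primaries and 1 replica per index, and 3 data nodes are made-up inputs, and the 400 is just the rough figure above, not a hard limit.

```python
import math

ROUGH_LIMIT = 400  # vague per-node figure from this reply, not an official limit


def shards_per_node(num_indices, primaries_per_index, replicas, data_nodes):
    """Average number of shard copies per data node, assuming an even spread."""
    total_shards = num_indices * primaries_per_index * (1 + replicas)
    return math.ceil(total_shards / data_nodes)


# Hypothetical growth in the number of projects (one index per project).
for projects in (50, 200, 1000):
    per_node = shards_per_node(projects, primaries_per_index=5, replicas=1, data_nodes=3)
    verdict = "over" if per_node > ROUGH_LIMIT else "within"
    print(f"{projects} projects: about {per_node} shards per node ({verdict} the rough limit)")
```

The point is that the shard count scales with indices times primaries times copies, so many small per-project indices get expensive quickly unless each one has very few shards.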
Basically, _shrink is your friend here: you can start with a relatively large shard count and reduce it later once you know how big the index actually is.
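
As a minimal sketch of what that looks like, the Python below only builds the two REST requests involved in a shrink and prints them; it does not talk to a cluster. The index names, the node name, and the helper functions are hypothetical. Two constraints are worth knowing: the target shard count has to be a factor of the source shard count, and before shrinking, the source index must be write-blocked with a copy of every shard on a single node.

```python
import json


def valid_shrink_targets(source_shards):
    """Shard counts a shrink can produce: factors of the source count, smaller than it."""
    return [n for n in range(1, source_shards) if source_shards % n == 0]


def shrink_requests(source, target, source_shards, target_shards, node_name):
    """Build (method, path, body) tuples for preparing and shrinking an index."""
    if target_shards not in valid_shrink_targets(source_shards):
        raise ValueError(f"{target_shards} is not a smaller factor of {source_shards}")
    # Step 1: block writes and relocate a copy of every shard to one node.
    prepare = ("PUT", f"/{source}/_settings", {
        "settings": {
            "index.routing.allocation.require._name": node_name,
            "index.blocks.write": True,
        }
    })
    # Step 2: shrink into the target, clearing the temporary settings
    # (None is serialized as JSON null) so they are not copied over.
    shrink = ("POST", f"/{source}/_shrink/{target}", {
        "settings": {
            "index.number_of_shards": target_shards,
            "index.routing.allocation.require._name": None,
            "index.blocks.write": None,
        }
    })
    return [prepare, shrink]


# Hypothetical example: an 8-shard project index shrunk down to 2 shards.
for method, path, body in shrink_requests("project-a", "project-a-small", 8, 2, "node-1"):
    print(method, path)
    print(json.dumps(body, indent=2))
```

Picking a starting shard count with plenty of factors (8 or 12, say) leaves more options for the eventual shrink.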

Back to your domain though - what sort of data is this?
