Shards/Routing Design for my use case

I currently have a system which consists of many Lucene indexes and allows
users to search over a user-defined subset of these indexes. It works
(surprisingly?) well, but I am migrating over to ElasticSearch for scale on
a cluster.

Some stats on the system:
~2500 Lucene Indexes
~1M (small) documents per index
~15 new indexes added each month

Let's assume that there is an index for each student in the current system.
Each of these students can be categorized into one of 3 majors: English,
History, Computer Science (not my real use case, but this is easier to
explain).

In migrating this system over to ElasticSearch I was considering keeping
the pattern of each student having their own shard (in the case of ES) but
after listening to Shay Banon's talk on the "kagillion" shards problem (tm)
I am thinking now that it is not the right approach.

It sounds like the better approach would be to create a single index
(students) and use routing to route all the documents for a given student
to the same shard, and then create aliases with filters.
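
To make that idea concrete, here is a rough sketch of the alias setup I have in mind. It only builds the JSON body for a POST to the _aliases endpoint and prints it, so nothing is sent to a cluster. The alias naming scheme (student_<id>) and the student_id field are placeholders I made up for illustration, not something that exists in my system.

```python
import json

INDEX = "students"  # the single index proposed above


def student_alias_action(student_id):
    """Build one 'add' action: a filtered alias that also pins the routing value."""
    return {
        "add": {
            "index": INDEX,
            "alias": "student_" + student_id,  # hypothetical naming scheme
            "routing": student_id,  # same value would be used when indexing
            "filter": {"term": {"student_id": student_id}},  # hypothetical field
        }
    }


def aliases_request(student_ids):
    """Body for POST /_aliases covering several students in one call."""
    return {"actions": [student_alias_action(s) for s in student_ids]}


body = aliases_request(["1001", "1002"])
print(json.dumps(body, indent=2))
```

Searching through such an alias would apply both the filter and the routing value, so a query for one student should only touch the shard that student's documents were routed to.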

My question is, would there be any advantage to creating 3 indexes
(english, history, computer_science) instead of just a single index
(students)?

If 50% of the students are English majors, 45% are History majors and 5%
are Computer Science majors, would it then make more sense to create the 3
indexes instead of the single index, because I could then allocate more
shards to english and history than to computer_science?
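
For a sense of scale, here is the back-of-the-envelope arithmetic behind that question, using the stats above. The target of 50M documents per shard is purely an assumption for illustration (I have not measured what a shard can comfortably hold), and the numbers ignore the ~15 new indexes added each month.

```python
TOTAL_STUDENTS = 2500         # ~2500 Lucene indexes today, one per student
DOCS_PER_STUDENT = 1_000_000  # ~1M small documents each
MAJOR_PERCENT = {"english": 50, "history": 45, "computer_science": 5}

# Assumption, not a measured figure: documents one shard should hold.
TARGET_DOCS_PER_SHARD = 50_000_000


def docs_and_shards(percent):
    """Return (document count, shard count) for a major with the given share."""
    docs = TOTAL_STUDENTS * DOCS_PER_STUDENT * percent // 100
    shards = max(1, -(-docs // TARGET_DOCS_PER_SHARD))  # integer ceiling
    return docs, shards


plan = {major: docs_and_shards(pct) for major, pct in MAJOR_PERCENT.items()}
for major, (docs, shards) in plan.items():
    print(f"{major}: {docs:,} docs -> {shards} shards")

total_shards = sum(shards for _, shards in plan.values())
print("total shards across the 3 indexes:", total_shards)
```

Under that assumed target, this works out to 25 shards for english, 23 for history and 3 for computer_science (51 in total), versus 50 shards for a single students index holding all 2.5 billion documents.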

I guess I'm not clear on under what circumstances it is better to create
multiple indexes over a single index.


