We are considering using ES as the primary data source for a new project.
Our data is generated by millions of different users, each having a
relatively small number of documents, yet each with a different data schema.
We are considering several approaches:
- Index per user: we are concerned about scaling the ES cluster to support
millions of indexes, each holding a relatively small number of docs.
- All users colocated in a single index: we are concerned that an ES index
will not support millions of different fields (as each user has a different
schema).
- A mix of the two above: colocating X users on a single index, and
having Y such indexes to host our entire user population.
- Implementing some kind of a "mapping layer" that maps each user's schema
onto generic fields in one or more indexes.
This would probably work, but is of course harder to implement and maintain.
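To make the last two options concrete, here is a minimal sketch of how the
mapping layer could combine with hash-based bucketing. Everything here is
illustrative: the index count, the generic field names (str_1, long_1, ...),
and the per-user schema registry are all assumptions, not a proposal for the
actual implementation.

```python
import hashlib

NUM_INDEXES = 16  # the "Y" from above; assumed value for illustration

def index_for_user(user_id: str) -> str:
    """Route a user to one of Y shared indexes by hashing the user id."""
    bucket = int(hashlib.md5(user_id.encode()).hexdigest(), 16) % NUM_INDEXES
    return f"users-{bucket:02d}"

# Hypothetical per-user schema registry: maps each user's own field names
# onto a small fixed pool of generic typed fields that the index mapping
# declares once, so the total field count stays bounded.
SCHEMAS = {
    "user-42": {"age": "long_1", "city": "str_1"},
}

def to_generic(user_id: str, doc: dict) -> dict:
    """Translate a user document into generic fields before indexing."""
    field_map = SCHEMAS[user_id]
    generic = {field_map[k]: v for k, v in doc.items()}
    generic["user_id"] = user_id  # filter field so queries stay per-user
    return generic

# Example: this doc would be indexed into index_for_user("user-42")
doc = to_generic("user-42", {"age": 31, "city": "Haifa"})
```

Queries would go through the same translation in reverse, which is exactly
the extra implementation and maintenance cost mentioned above.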
So, my questions:
- Are there production deployments out there that have a million active
indexes? What do they look like?
- How many different fields does it make sense to host in a single index?
Would it scale to millions of fields in a single index?
- Are there other ways to go about this that we have overlooked?
You received this message because you are subscribed to the Google Groups "elasticsearch" group.