Hi,

We are considering using ES as a primary data source for a new project. Our data is generated by millions of different users, each having a relatively small number of documents, yet each having a different data schema.
We are considering several approaches:
- Index per user: we are concerned about scaling the ES cluster to support millions of indexes, each with a relatively small number of docs.
- All users colocated in a single index: we are concerned that an ES index will not support millions of different fields (as each user has a different data schema).
- A mix of the two above: X users colocated in a single index, with Y such indexes hosting our entire user population.
- Implementing some kind of "mapping layer" that maps users' schemas onto generic fields in one or more indexes (see the sketch after this list). This would probably work, but of course it is harder to implement and maintain.
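To make the last option a bit more concrete, here is a minimal sketch in Python of what we have in mind. All names here (`GENERIC_POOLS`, `assign_generic_fields`, `str_0`/`num_0`, etc.) are our own illustrations, not anything provided by ES, and the actual indexing/query calls through an ES client are left out:

```python
# Sketch of a "mapping layer": each user's ad-hoc field names are mapped onto
# a fixed pool of generic, pre-mapped fields (str_0..str_N, num_0..num_N, ...),
# so the index mapping stays small no matter how many users it hosts.
# The per-user field_map would be persisted somewhere (in ES or elsewhere)
# so that queries can be rewritten the same way before being sent.

GENERIC_POOLS = {"string": "str_{}", "number": "num_{}", "date": "date_{}"}

def assign_generic_fields(user_schema):
    """Build a per-user map from the user's field names to generic field names.

    user_schema: dict of {field_name: type}, e.g. {"age": "number", "city": "string"}
    Returns e.g. {"age": "num_0", "city": "str_0"}.
    """
    counters = {t: 0 for t in GENERIC_POOLS}
    field_map = {}
    for name, ftype in sorted(user_schema.items()):
        field_map[name] = GENERIC_POOLS[ftype].format(counters[ftype])
        counters[ftype] += 1
    return field_map

def to_generic(doc, field_map):
    """Rewrite a user document's keys to the generic field names before indexing."""
    return {field_map[k]: v for k, v in doc.items()}

# Example: two users with different schemas end up sharing the same generic fields.
user_a_map = assign_generic_fields({"age": "number", "city": "string"})
user_b_map = assign_generic_fields({"price": "number", "sku": "string"})
print(to_generic({"age": 31, "city": "Haifa"}, user_a_map))    # {'num_0': 31, 'str_0': 'Haifa'}
print(to_generic({"price": 9.99, "sku": "AB-12"}, user_b_map))  # {'num_0': 9.99, 'str_0': 'AB-12'}
```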
So my questions:
- Are there production deployments out there that have a million active indexes? What do they look like?
- How many different fields does it make sense to host in a single index? Would it scale to millions of fields in a single index?
- Are there other ways to go about this that we have overlooked?
thanks!!