Heh, confusing subject, sorry.
Our application shards by client: we have a separate Postgres database per
client. So I naturally gravitated towards creating a separate
Elasticsearch index per client. After reading through this group for a
while, I realize that was a mistake: I now have a single-node "cluster"
that has over 1000 shards.
I've read some messages suggesting the way to go in this situation is this:
- Create a single index with 20-30 shards (or however large you want
  your cluster to be able to grow to).
- Create an alias per client with filter on, say, field client_id.
- Optionally specify routing on the alias.
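To make the alias step concrete, here is a rough sketch of the request body for the `_aliases` endpoint, which accepts an `add` action with a `filter` and an optional `routing` value. The index name `clients` and the `client_<name>` alias naming are made up for illustration, and the term filter assumes client_id is indexed unanalyzed so it matches exactly. The body is built as a plain dict so it can be sent with whatever HTTP client you already use.

```python
import json


def client_alias_action(index, client_id, use_routing=False):
    """Build a body for POST /_aliases that adds one filtered alias per client."""
    add = {
        "index": index,
        # Hypothetical naming scheme, not something ES requires.
        "alias": "client_" + client_id.lower(),
        # Searches through the alias only see this client's documents.
        "filter": {"term": {"client_id": client_id}},
    }
    if use_routing:
        # Optional: pins this client's documents to a single shard.
        add["routing"] = client_id
    return {"actions": [{"add": add}]}


body = client_alias_action("clients", "Client1")
print(json.dumps(body, indent=2))
```

Passing `use_routing=True` adds the `routing` key, which is the optional third step in the list above.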
So I have a few questions about this setup.
The "primary key" in Elasticsearch is _id and _type, correct? So I'm going
to have to change my code to set _id to "client_id:id"? Or will ES allow
for the following two documents:
  _id: 123
  _type: "Type1"
  client_id: "Client1"

  _id: 123
  _type: "Type1"
  client_id: "Client2"
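A minimal sketch of the "client_id:id" idea, assuming the two documents above would otherwise overwrite each other once they live in the same index. The helper names `make_doc_id` and `split_doc_id` are hypothetical.

```python
def make_doc_id(client_id, local_id):
    """Combine the client and the per-client id into one index-wide _id."""
    return f"{client_id}:{local_id}"


def split_doc_id(doc_id):
    """Recover (client_id, local_id); splits on the last colon only."""
    client_id, _, local_id = doc_id.rpartition(":")
    return client_id, int(local_id)


# The two example documents now get distinct ids.
print(make_doc_id("Client1", 123))
print(make_doc_id("Client2", 123))
```

Splitting on the last colon keeps the round trip working even if a client identifier itself contains a colon.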
We're leaning towards not specifying the routing in the alias because we're
afraid of creating hotspots; we just want each "client" evenly distributed
across all shards, and will rely on adding nodes and increasing replication
to handle scaling of reads. Does that sound reasonable?
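To illustrate the hotspot worry: Elasticsearch picks a shard as hash(routing) % number_of_primary_shards, where the routing value defaults to the document's _id. The sketch below is only a simulation under stated assumptions: `zlib.crc32` stands in for the real hash function, and the shard count of 20 and the "Client1:<n>" ids are made up.

```python
import zlib
from collections import Counter

NUM_SHARDS = 20  # assumed shard count for the single shared index


def shard_for(routing_value, num_shards=NUM_SHARDS):
    """Stand-in for shard selection: hash(routing) % number_of_primary_shards."""
    return zlib.crc32(routing_value.encode("utf-8")) % num_shards


doc_ids = [f"Client1:{i}" for i in range(1000)]

# No routing on the alias: each document is routed by its own _id.
by_id = Counter(shard_for(doc_id) for doc_id in doc_ids)

# Routing on the alias: every document of the client uses the same value.
by_client = Counter(shard_for("Client1") for _ in doc_ids)

print("shards used when routing by _id:", len(by_id))
print("shards used when routing by client:", len(by_client))
```

With per-client routing all of one client's documents land on one shard, so a very large client becomes a hotspot; the trade-off is that searches through that alias only need to hit that one shard.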
Now for the crummy part. Each of our clients' documents will have
different fields. For example, we have a document type
"Application::Profile". For Client1, the fields might be [a, b, c], but
for Client2 the fields will be [d, e, f]. So I see two ways to solve this
problem:
- Define type "Application::Profile" to have fields that are a superset
  of all the fields of all the clients.
- Define different types for each client:
  "Application::Profile/Client1", "Application::Profile/Client2"
Any suggestions? I don't really like either one of those solutions and am
considering just continuing with the idea of 1 index per client, but
reducing the number of shards per index down to 1 and then just adding
nodes. This still has issues though, like hotspots.
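For comparison, the two mapping options listed above can be sketched with the example field lists from this message. The `client_fields` dict and both helper names are hypothetical, and this only computes field and type names, not a full mapping.

```python
client_fields = {
    "Client1": ["a", "b", "c"],
    "Client2": ["d", "e", "f"],
}


def superset_fields(fields_by_client):
    """Option 1: one shared type whose fields are the union over all clients."""
    return sorted(set().union(*fields_by_client.values()))


def per_client_type(base_type, client_id):
    """Option 2: one type name per client, e.g. Application::Profile/Client1."""
    return f"{base_type}/{client_id}"


print(superset_fields(client_fields))
for client in client_fields:
    print(per_client_type("Application::Profile", client), client_fields[client])
```

Option 1 grows a single mapping with every new client field, while option 2 grows the number of types with every new client.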
Thanks for the help.