Hi,
I am working on a multi-tenant application and plan to use index alias routing to index and search data for a particular tenant.
I am planning to create a dedicated index/alias for each high-traffic tenant and to group low-traffic tenants under a few aliases.
As I understand it, when I apply routing, only one shard is used for indexing and searching that tenant's data.
But if a high-traffic tenant always uses only one shard (and, I assume, a single node), that shard will quickly grow large. Won't a large shard itself become a bottleneck after some time?
If yes, what are other ways to mitigate this problem?
One possible way is to create more indexes for the same tenant and add all the new indexes to the same alias. But then, what should the criteria be for creating a new index?
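
To make the "one shard per routing value" point concrete, here is a minimal sketch of how a routing value could be mapped to a shard. It assumes the usual "hash the routing value, then take it modulo the number of primary shards" scheme. The hash used here (`zlib.crc32`) is only a stand-in for illustration and is not Elasticsearch's own hash function. The names `shard_for` and `shards_touched` are hypothetical.

```python
import zlib


def shard_for(routing: str, num_primary_shards: int) -> int:
    """Map a routing value to a shard number.

    Illustrative only: crc32 stands in for the real hash function.
    """
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


def shards_touched(routing_values, num_primary_shards):
    """Return the set of shards hit by a sequence of routed requests."""
    return {shard_for(r, num_primary_shards) for r in routing_values}


# Every request routed with the same tenant value lands on the same shard,
# so that shard receives all of the tenant's indexing and search load.
hot_tenant_requests = ["tenant_a"] * 1000
print(shards_touched(hot_tenant_requests, num_primary_shards=5))
```

Whatever the hash, the set printed for a single routing value always has exactly one element, which is why a high-traffic tenant can produce a "hot" shard.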
Thanks for the link. This presentation is very informative.
But I am still looking for an answer.
I understand that when we use alias routing, all indexing and searching happens on one shard.
I noticed that indexing becomes a bit slower when only one shard is used (I tested this with bulk indexing).
Now, what steps can be taken to mitigate the issue of one "hot" shard?
Will creating new indexes and adding them to the same alias work efficiently? I am also worried that there is always a maximum shard size and that a high-traffic tenant may cross that limit.
Also, in Kimchy's presentation, I didn't understand one point on the slide "users data flow - single index + routing".
It refers to large "overallocation". What exactly does "large overallocation" mean in the context of this slide?
Thanks
Ashish
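
For reference, the "add new indexes to the same alias" idea could look roughly like the sketch below. It only builds the JSON body for the `_aliases` endpoint (an `add` action with optional `routing` and `filter`) and does not send anything. The index names, the alias name, the `tenant_id` field and the 30 GB threshold are all hypothetical, and the size-based criterion is just one possible rule, not a recommendation from this thread.

```python
import json


def alias_add_body(index, alias, routing=None, tenant_filter=None):
    """Build an `_aliases` request body that adds `index` to `alias`."""
    add = {"index": index, "alias": alias}
    if routing is not None:
        add["routing"] = routing  # applied to both indexing and searching
    if tenant_filter is not None:
        add["filter"] = tenant_filter  # useful when tenants share an index
    return {"actions": [{"add": add}]}


def needs_new_index(shard_size_gb, max_shard_size_gb=30.0):
    """Hypothetical criterion: start a new index once the shard is too big."""
    return shard_size_gb >= max_shard_size_gb


# Example: the tenant's current shard has reached 35 GB, so a second
# index is added behind the same alias.
if needs_new_index(35.0):
    body = alias_add_body("tenant_a_2", "tenant_a")
    print(json.dumps(body, indent=2))
```

One caveat with this approach, as far as I know: searches through an alias that spans several indexes work, but indexing through such an alias is not accepted by default, so new documents would have to be written to the newest concrete index.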
On Saturday, June 9, 2012 6:19:11 PM UTC-4, Ashish Nigam wrote:
On Jun 8, 2012, at 1:53 PM, Benjamin Devèze wrote:
Hi, this may not answer all your questions, but have you looked at kimchy's latest presentation? I think it addresses some of the points you are raising.