UnavailableShardsException When Creating Index With Settings


(timscott) #1

In 0.16, when you create an index with settings, you can't PUT documents to it.
You get an UnavailableShardsException.


(Shay Banon) #2

I am guessing you are starting a single node, in which case this failure is expected. Because you created the index with 2 replicas per shard, a single node cannot hold enough active shard copies for a write operation to proceed. See here: http://www.elasticsearch.org/guide/reference/api/index_.html (under write consistency).
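
To make the arithmetic concrete, here is a minimal sketch of the quorum check as I understand the write consistency behaviour of that era. The rule that quorum only applies with more than two shard copies is an assumption, not taken from this thread, although it agrees with the advice that 0 or 1 replica works on a single node. The function names are hypothetical.

```python
def required_active_copies(number_of_replicas: int) -> int:
    """Active shard copies a write needs under quorum consistency (assumed rule)."""
    copies = number_of_replicas + 1  # one primary plus its replicas
    # Assumption: the quorum check only applies when there are more than
    # two copies of a shard; otherwise an active primary is enough.
    if copies > 2:
        return copies // 2 + 1
    return 1


def write_allowed(number_of_replicas: int, data_nodes: int) -> bool:
    """True if enough copies of a shard can be active on `data_nodes` nodes."""
    copies = number_of_replicas + 1
    # Copies of the same shard are never allocated to the same node,
    # so a single node can hold at most one copy (the primary).
    active = min(copies, data_nodes)
    return active >= required_active_copies(number_of_replicas)


if __name__ == "__main__":
    for replicas in (0, 1, 2):
        print(replicas, "replica(s) on 1 node:", write_allowed(replicas, 1))
```

Under these assumptions, 2 replicas on one node needs 2 active copies but only the primary can be allocated, which is the situation that produces UnavailableShardsException.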

If you use a single node, creating the index with 0 replicas, or 1 replica, is good enough. You can always use the update settings API to increase the number of replicas once you add more nodes (or just let the shards rebalance themselves).

On Tuesday, April 26, 2011 at 2:18 AM, Tim Scott wrote:

In 0.16 when you create an index with settings, you can't PUT to it.
You get UnavailableShardsException.

https://gist.github.com/941463
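
For the single-node case discussed in this reply, the two request bodies involved might look like the sketch below. It only builds the JSON; the index name `myindex` and the helper names are hypothetical, and the shard count of 5 is just an example value. The bodies are meant for `PUT /myindex` (create) and `PUT /myindex/_settings` (update settings).

```python
import json


def create_index_body(number_of_shards: int, number_of_replicas: int) -> dict:
    """Body for creating an index with explicit shard/replica settings."""
    return {
        "settings": {
            "index": {
                "number_of_shards": number_of_shards,
                "number_of_replicas": number_of_replicas,
            }
        }
    }


def update_replicas_body(number_of_replicas: int) -> dict:
    """Body for the update settings API; only the replica count changes."""
    return {"index": {"number_of_replicas": number_of_replicas}}


if __name__ == "__main__":
    # Start with no replicas while there is only one node ...
    print("PUT /myindex", json.dumps(create_index_body(5, 0)))
    # ... and raise the replica count later, once more nodes have joined.
    print("PUT /myindex/_settings", json.dumps(update_replicas_body(1)))
```

The number of shards is fixed when the index is created, so only the replica count appears in the update body.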


(timscott) #3

Still trying to get my brain around shards and replicas.

Maybe you (or anyone) would be kind enough to offer opinions???

Here are the basics of my application. It's multi-tenant. It's just
getting started, and we have very few users. Each user has his own
index. I currently have one small EC2 server dedicated to ES. We
hope to grow to thousands and eventually hundreds of thousands of
users. No single user will be too big. The average user might grow
to say 5k - 50k documents after a while. Large users might have 100k
documents but rarely much more. No one user will have very heavy
usage. Documents will come in one at a time or in small batches.
Searches by any one user will be mostly occasional.

Does this set of facts point clearly to any shard/replica strategy?
Anything I didn't mention that would lead me in one way or another?
Thanks very much in advance.



(Shay Banon) #4

If you are going to have such a small number of users, then 1 shard per index is enough in terms of index capacity. Replicas serve two purposes: the first is high availability, the second is more capacity when it comes to search. So it is up to you; 1 replica should be enough.
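
To put rough numbers on this, the sketch below counts shard copies for the index-per-user layout, using the user counts mentioned earlier in the thread (thousands, eventually hundreds of thousands) and the 1 shard / 1 replica suggested here. The function name is hypothetical.

```python
def total_shard_copies(indices: int, shards_per_index: int, replicas: int) -> int:
    """Every shard exists once as a primary plus once per replica."""
    return indices * shards_per_index * (1 + replicas)


if __name__ == "__main__":
    # One index per user, 1 shard and 1 replica each.
    for users in (1_000, 100_000):
        print(users, "users ->", total_shard_copies(users, 1, 1), "shard copies")
    # A single shared index with 20 shards and 1 replica, for comparison.
    print("shared index ->", total_shard_copies(1, 20, 1), "shard copies")
```

The per-user count grows linearly with the number of users, while a shared index stays constant.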

Going back to your design: for such a use case I would go with a different design. I would create a single index with 10-20 shards and 1 replica, and use routing based on the username/userid to control which shard each user's indexing and search operations go to (when searching, also filter on the user name). Per-user operations are then faster. The simple reason is that an index per user, even with a single shard, comes with overhead; the proposed solution results in much smaller overhead.
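
A minimal sketch of what the routed requests could look like, assuming a shared index named `users`, a document type `doc`, and a `user_id` field (all three names are hypothetical). It only builds the method, path, and body; sending them is left to whatever HTTP client you use. The search body uses the `filtered` query with a `term` filter, which is the query DSL of that era, and it assumes `user_id` is indexed without analysis so the term matches exactly.

```python
from urllib.parse import quote, urlencode

INDEX = "users"         # hypothetical shared index name
DOC_TYPE = "doc"        # hypothetical document type
USER_FIELD = "user_id"  # hypothetical field holding the owner of a document


def index_request(user_id: str, doc_id: str, document: dict):
    """Build a routed index request; the user id doubles as the routing value."""
    path = f"/{INDEX}/{DOC_TYPE}/{quote(doc_id, safe='')}?" + urlencode({"routing": user_id})
    body = dict(document)
    body[USER_FIELD] = user_id  # stored so that searches can filter on it
    return "PUT", path, body


def search_request(user_id: str, query: dict):
    """Build a routed search that is also filtered on the user id."""
    path = f"/{INDEX}/_search?" + urlencode({"routing": user_id})
    body = {
        "query": {
            "filtered": {
                "query": query,
                # Routing only picks the shard; other users can share that
                # shard, so the filter is what restricts the results.
                "filter": {"term": {USER_FIELD: user_id}},
            }
        }
    }
    return "POST", path, body


if __name__ == "__main__":
    print(index_request("alice", "1", {"title": "hello"}))
    print(search_request("alice", {"match_all": {}}))
```

The same routing value has to be used for indexing and searching, otherwise the search may go to a shard that does not hold the user's documents.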

-shay.banon


(timscott) #5

Thanks very much for your thoughtful advice. One question. Each
tenant is storing sensitive information, and it's absolutely
critical that one tenant never accesses another's information. I'm
not sure why, but it seems to me that index-per-tenant provides some
level of isolation that would enhance security. Is there anything to
that thought?



(Shay Banon) #6

It does provide better isolation, but if you make sure you filter on the username every time you execute a search request, it will be as safe.
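
One way to make "every time" hard to forget is to route all searches through a small wrapper that always adds the filter and double-checks the hits before returning them. This is only a sketch under assumptions: the `user_id` field and the `TenantScope` class are hypothetical, and the hit check assumes each hit carries its `_source`.

```python
USER_FIELD = "user_id"  # hypothetical field holding the owning tenant


class TenantScope:
    """Builds and verifies searches on behalf of exactly one tenant."""

    def __init__(self, tenant_id: str):
        if not tenant_id:
            raise ValueError("tenant_id must not be empty")
        self.tenant_id = tenant_id

    def wrap(self, query: dict) -> dict:
        """Return a search body in which the tenant filter is always present."""
        return {
            "query": {
                "filtered": {
                    "query": query,
                    "filter": {"term": {USER_FIELD: self.tenant_id}},
                }
            }
        }

    def check_hits(self, hits: list) -> list:
        """Defensive check: refuse results that belong to another tenant."""
        for hit in hits:
            owner = hit.get("_source", {}).get(USER_FIELD)
            if owner != self.tenant_id:
                raise RuntimeError("search returned a document of another tenant")
        return hits


if __name__ == "__main__":
    scope = TenantScope("alice")
    print(scope.wrap({"match_all": {}}))
```

Application code then never builds a raw search body itself, so a missing filter becomes a code-review problem in one place rather than in every query.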

