Hmm. I think the problem I'll have then is that each customer has their own set of ids, so a document's id won't necessarily be unique across customers. Placing each customer in their own index was a simple way of dealing with this. Is there any out-of-the-box way to address this as well, or should I just start using a compound key?
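If you do go the compound-key route, one common pattern is just to prefix the per-customer id with the customer id when building the document _id. A minimal sketch (the helper name and separator are my own, not anything built into ES):

```python
def compound_id(customer_id: str, doc_id: str) -> str:
    """Hypothetical helper: prefix a per-customer id with the customer id,
    so the resulting _id is unique across all customers sharing one index."""
    # ":" as separator is an assumption; pick anything that can't appear
    # in your customer ids.
    return "%s:%s" % (customer_id, doc_id)

# e.g. customer "acme" with internal doc id "17"
print(compound_id("acme", "17"))  # acme:17
```

You'd then index and fetch documents by this compound _id instead of the raw per-customer id.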
From: email@example.com [firstname.lastname@example.org] on behalf of Shay Banon [email@example.com]
Sent: Monday, August 08, 2011 12:09 PM
Subject: Re: # of shards vs. open files
Yes, each shard is a Lucene index, which requires its share of open file handles (and memory, and so on). You can go with a single index and route based on user. It's simpler to do that with 0.17, since you can associate an alias with the username; the alias can have a filter (to return results only for the relevant user) and a routing value (probably the username).
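For reference, a sketch of the filtered, routed alias described above, as a body POSTed to the `_aliases` endpoint (the index name `data`, the username `jdoe`, and the field name `user` are placeholder assumptions):

```json
{
  "actions": [
    {
      "add": {
        "index": "data",
        "alias": "jdoe",
        "filter": { "term": { "user": "jdoe" } },
        "routing": "jdoe"
      }
    }
  ]
}
```

Searching through the alias `jdoe` then only returns that user's documents, and both indexing and search requests through it are routed to a single shard.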
On Mon, Aug 8, 2011 at 9:52 PM, Javier Muniz <firstname.lastname@example.org:email@example.com> wrote:
Does the # of shards a node has impact the # of open files required? I am running a node that has more than 3000 shards because I have broken the data into a single-index-per-customer layout, and I am finding myself running into the "too many open files" problem more and more. I recently had to bump the open-file limit past 32000 (verified via -Des.max-open-files=true) in order to continue adding customers to the node.
I guess my question is: should I put all of these customers into a single index to reduce the total # of shards, reduce the # of shards per index, or is my problem completely unrelated to sharding?