And make sure you have enough memory allocated to the nodes as well. You
can check how much memory is being occupied by the field data cache by using
the node stats API.
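As a rough sketch of the check described above, the snippet below pulls the field data cache size per node out of an already-parsed node stats response. The response shape (`nodes.<id>.indices.fielddata.memory_size_in_bytes`) is an assumption based on recent Elasticsearch releases; the version discussed in this thread may expose the figure under a different path and key. The sample data and node names are made up for illustration.

```python
def fielddata_bytes_per_node(stats: dict) -> dict:
    """Map each node's name to its field data cache size in bytes.

    `stats` is the parsed JSON body of a node stats response.
    Assumed layout: stats["nodes"][node_id]["indices"]["fielddata"].
    """
    result = {}
    for node_id, node in stats.get("nodes", {}).items():
        # Fall back to the node id when no human-readable name is present.
        name = node.get("name", node_id)
        fielddata = node.get("indices", {}).get("fielddata", {})
        result[name] = fielddata.get("memory_size_in_bytes", 0)
    return result


# Hypothetical sample response, trimmed to the relevant keys.
sample = {
    "nodes": {
        "abc123": {
            "name": "node-a",
            "indices": {"fielddata": {"memory_size_in_bytes": 1048576}},
        },
        "def456": {"name": "node-b", "indices": {}},
    }
}

usage = fielddata_bytes_per_node(sample)
for node_name, size in sorted(usage.items()):
    print(f"{node_name}: {size / (1024 * 1024):.1f} MiB of field data")
```

Comparing these numbers against the heap given to each node shows how much headroom is left for sorting and facets.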
On Thursday, January 20, 2011 at 6:38 PM, Shay Banon wrote:
As long as you plan to add more capacity when it comes to number of
servers/nodes.
On Thursday, January 20, 2011 at 6:36 PM, Lee Parker wrote:
We do regularly sort the results using a field we call date, which contains
a Unix timestamp as an integer. We are already experiencing slow results
from a filtered and sorted query. If we currently have about 60 GB of data,
which we run filtered and sorted queries against regularly, should we plan to
use a greater number of shards?
Lee
"It doesn't matter whether you are liberal or conservative, but it's
dangerous to always think with exclamation points instead of question
marks."
by Marty Beckerman
On Thu, Jan 20, 2011 at 10:18 AM, Shay Banon <shay.banon@elasticsearch.com> wrote:
It really depends on what you plan to do with it. Shards can certainly be
bigger than 10gb. If you use things like facets or sorting, the values of
the field per doc will be loaded into memory, which means that you might need
more memory as the shard doc count grows. But that is linear growth, which
can be easily derived from some capacity planning done on a smaller size.
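To illustrate the linear growth mentioned above, here is a minimal sketch that extrapolates field memory from a measurement on a smaller index. The function name and the numbers are hypothetical; the only assumption taken from the thread is that memory grows linearly with doc count.

```python
def estimate_field_memory(sample_docs: int, sample_bytes: int, target_docs: int) -> int:
    """Extrapolate field memory linearly from a smaller measured sample.

    sample_docs  -- doc count of the small test index
    sample_bytes -- field memory measured on that test index
    target_docs  -- doc count you expect in production
    """
    if sample_docs <= 0:
        raise ValueError("sample_docs must be positive")
    bytes_per_doc = sample_bytes / sample_docs
    return round(bytes_per_doc * target_docs)


# Hypothetical measurement: 1M docs used 8,000,000 bytes for the sort field.
estimate = estimate_field_memory(1_000_000, 8_000_000, 50_000_000)
print(f"Estimated field memory at 50M docs: {estimate / 1e6:.0f} MB")
```

The estimate only covers the fields that are actually sorted or faceted on, so it should be repeated per field and summed.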
-shay.banon
On Thursday, January 20, 2011 at 4:01 PM, Lee Parker wrote:
It was my understanding based on some other threads that performance will
degrade once shards are greater than 10g. If that is not true then I will
just stick with the 5 shards and 1 replica setup.
Lee
On Wed, Jan 19, 2011 at 1:22 PM, Shay Banon <shay.banon@elasticsearch.com> wrote:
The default of 5 shards allows you to grow up to 5 nodes in terms of index
capacity, and in terms of improving search capacity, you can simply
dynamically add more replicas to the cluster. So, with a 5 shard, 1 replica
setup, you will max out at 10 nodes (nothing will be left to be allocated
to an 11th node if one is started).
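The arithmetic behind the 10 node figure can be sketched as follows. The function name is hypothetical; it assumes the cluster is saturated once every shard copy (primary or replica) sits on its own node.

```python
def max_useful_nodes(shards: int, replicas: int) -> int:
    """Number of nodes that can each hold at least one shard copy.

    Each primary shard has `replicas` additional copies, so the index
    consists of shards * (1 + replicas) shard copies in total.
    """
    if shards < 1 or replicas < 0:
        raise ValueError("need at least 1 shard and 0 or more replicas")
    return shards * (1 + replicas)


print(max_useful_nodes(5, 1))  # the 5 shard, 1 replica setup from the thread
print(max_useful_nodes(5, 2))  # same index after adding one more replica
```

Since the replica count can be changed on a live index while the shard count cannot, adding replicas is the cheaper way to raise this ceiling for search load.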
Sounds like you are planning on growing from 1-2 nodes to 3. This is still
well below the capacity provided by 5 shards and 1 replica. A 12gb index is
not a big one.
-shay.banon
On Wednesday, January 19, 2011 at 7:39 PM, Clinton Gormley wrote:
Hi Lee
In that case, here is my plan for increasing my shard count:
- Spin up a new ES server but make sure it doesn't join the existing
server as a node.
- Pull the list of documents from my data and then get the documents
from the existing ES server and put them on the new one.
- Once all the data is in the new ES server, shut down the older one,
wipe its data, and then start it up as a new node to join the new ES
server.
Alternatively, instead of messing with a new server, you could:
- create a new index 'new_index_timestamp' on the same ES server
- index from 'my_index' to 'new_index_timestamp'
- delete 'my_index'
- create alias 'my_index' pointing to 'new_index_timestamp'
The benefit of switching to using aliases is that it will make it easier
to make changes to your index in the future.
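The alias-based steps above can be sketched as a list of HTTP requests. This is only an outline under stated assumptions: the helper name is made up, the alias request uses the `_aliases` endpoint with an `add` action, and the copy step is left as a comment because it has to be done by your own indexing code.

```python
import json


def alias_migration_plan(old_index: str, new_index: str) -> list:
    """Return (method, path, body) tuples for the alias-based migration.

    The alias reuses the old index name, so the old index has to be
    deleted before the alias is created.
    """
    alias_body = {"actions": [{"add": {"index": new_index, "alias": old_index}}]}
    return [
        ("PUT", f"/{new_index}", None),        # create the new index
        # ... copy every document from old_index into new_index here ...
        ("DELETE", f"/{old_index}", None),     # remove the old index
        ("POST", "/_aliases", json.dumps(alias_body)),  # point the alias at the new index
    ]


plan = alias_migration_plan("my_index", "new_index_timestamp")
for method, path, body in plan:
    print(method, path, body or "")
```

Between the delete and the alias request the name 'my_index' does not resolve to anything, so clients should be paused or should retry during that window.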
clint