A few questions about Elasticsearch index and cluster

Robbie_Cheng · October 23, 2012, 10:51am

-can we add/remove specific fields dynamically, once the index has
been created?
-can we configure ratio of index in memory or in file?
-can we add new node into elasticsearch cluster when it is required?
any side effect?

--

Igor_Motov · October 23, 2012, 11:57am

You can add new fields easily, but they cannot be removed without
reindexing.

Not sure what you mean by ratio. Elasticsearch provides some control over
individual cache sizes: http://www.elasticsearch.org/guide/reference/index-modules/cache.html
It also lets you allocate indices completely in memory, or memory-map your
indices using the mmapfs index store: http://www.elasticsearch.org/guide/reference/index-modules/store.html

You can add and remove nodes from an Elasticsearch cluster. When a new node
is added to the cluster, Elasticsearch rebalances its shards so that
all nodes in the cluster hold roughly the same number of shards.

--

radu_gheorghe · October 23, 2012, 12:12pm

Hello Robbie,

On Tue, Oct 23, 2012 at 1:51 PM, Robbie Cheng robbiecheng@gmail.com wrote:

-can we add/remove specific fields dynamically, once the index has
been created?

You can always add new fields to your mapping, but you can't remove
them. If you want to remove a field you will have to reindex your
data.
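
As a rough illustration of the point above, adding a field amounts to sending the new field definition to the index's mapping endpoint. The sketch below only builds the request and does not send it. It assumes a recent Elasticsearch version with typeless mappings (`PUT /<index>/_mapping`); older releases, including those current when this thread was written, put a type name in the path. The index name `articles` and the field `published_at` are made up for the example.

```python
import json


def add_field_request(index, field, field_type):
    """Build (method, path, body) for adding one field to an existing mapping.

    Assumes the typeless put-mapping endpoint of recent Elasticsearch versions.
    """
    body = {"properties": {field: {"type": field_type}}}
    return "PUT", f"/{index}/_mapping", json.dumps(body)


# Hypothetical index and field names, for illustration only.
method, path, body = add_field_request("articles", "published_at", "date")
print(method, path)
print(body)
```

There is no matching request that removes a field: dropping one means creating a new index with the mapping you want and reindexing the data into it.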

For more information about mapping, take a look here:

-can we configure ratio of index in memory or in file?

I'm not sure if I got the question right, but I don't think you can.
You can specify whether to store your index in memory or on the file
system, and there are some settings for each:

Given your question, I suppose your data doesn't fit in memory, so
you'd probably want to store your indices on the file system, and
leave it up to the OS to cache some of that in memory. Elasticsearch
also has its own caches: for example, many filters are cached by
default and you can change settings there. You can find some more
information about ES caching here:

-can we add new node into elasticsearch cluster when it is required?
any side effect?

Definitely. This is where sharding comes into play. In short, it goes like this:

- Each index is divided into shards. By default there are 5, but you
  can change that for each index.
- Each shard might have a number of replicas. Replicas are good for
  redundancy and also query performance (since queries will also run on
  replicas). By default there's 1 replica for each shard.

So if you have one node and one index, by default you will see all 5
primary shards allocated on that single node. Replicas won't get allocated
in this scenario, because a replica is never placed on the same node as
its primary.

If you add one more node, the 5 replicas will get assigned to it.

If you continue to add nodes, your 5 shards and 5 replicas (one per
shard) will be automatically balanced between nodes so that each node
will get roughly the same number of shards. For example, on 5 nodes
you'll get 2 shards for each (whether they are primary shards or
replicas).
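
The balancing described above can be sketched as simple arithmetic. This is a simplified model, not Elasticsearch's actual allocator: it assumes an even spread and only enforces that two copies of the same shard never share a node. The function name `shards_per_node` is made up for the example.

```python
def shards_per_node(num_nodes, primaries=5, replicas=1):
    """Return how many shard copies each node holds under an even spread.

    Simplified model: a copy of a shard is never placed on a node that
    already holds another copy of the same shard, so at most `num_nodes`
    copies of each shard can be assigned.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    copies_per_shard = min(1 + replicas, num_nodes)
    total = primaries * copies_per_shard
    base, extra = divmod(total, num_nodes)
    # The first `extra` nodes carry one more shard than the rest.
    return [base + 1 if i < extra else base for i in range(num_nodes)]


for n in (1, 2, 5):
    print(n, "node(s):", shards_per_node(n))
# 1 node(s): [5]
# 2 node(s): [5, 5]
# 5 node(s): [2, 2, 2, 2, 2]
```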

So with one index, with the default configuration, you're good for 10
nodes. If you add an 11th node, it will get no shards. While you can
change the number of replicas on a live cluster, you can't change the
number of primary shards. So that's why you'd need to plan your number
of shards in production.
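
To make the difference between the two settings concrete, here is a small sketch of the requests involved. It only builds the method, path and JSON body and does not talk to a cluster. The setting names `number_of_shards` and `number_of_replicas` are the real index settings; the index name `logs` and the helper function names are assumptions for the example.

```python
import json


def create_index_request(index, primaries, replicas):
    """Build the request that creates an index.

    The primary shard count is fixed at this point.
    """
    body = {
        "settings": {
            "index": {
                "number_of_shards": primaries,
                "number_of_replicas": replicas,
            }
        }
    }
    return "PUT", f"/{index}", json.dumps(body)


def set_replicas_request(index, replicas):
    """Build the request that changes the replica count of a live index."""
    body = {"index": {"number_of_replicas": replicas}}
    return "PUT", f"/{index}/_settings", json.dumps(body)


# Hypothetical index name, for illustration only.
print(create_index_request("logs", primaries=5, replicas=1))
print(set_replicas_request("logs", replicas=2))
```

Only the second request can be repeated with a different value later on; getting a different number of primary shards means creating a new index and reindexing.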

Please note that the logic above applies to the total number of shards
in your cluster. For all your indices. So, for example if you have 2
indices with the default configuration on a cluster, Elasticsearch can
distribute your data on up to 20 nodes.
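
The cluster-wide count is just a sum over indices. A minimal sketch, where `max_useful_nodes` is a made-up helper and each index is given as a `(primaries, replicas)` pair:

```python
def max_useful_nodes(indices):
    """Total number of shard copies in the cluster.

    `indices` is a list of (primaries, replicas) pairs. Each copy can
    live on its own node, so this is also the largest number of nodes
    that can each hold at least one shard.
    """
    return sum(p * (1 + r) for p, r in indices)


print(max_useful_nodes([(5, 1)]))          # one default index  -> 10
print(max_useful_nodes([(5, 1), (5, 1)]))  # two default indices -> 20
```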

Best regards,
Radu

http://sematext.com/ -- Elasticsearch -- Solr -- Lucene

--

Robbie_Cheng · November 22, 2012, 1:17am

Hey Radu,

Thanks for your detailed reply. One follow-up question about the number of
primary shards, since it can't be changed afterwards:
what's the maximum number of primary shards, and is there any way that we
can determine how many primary shards we need?

Thanks,

--

radu_gheorghe · November 24, 2012, 10:16am

Hello Robbie,

On Thu, Nov 22, 2012 at 3:17 AM, Robbie Cheng robbiecheng@gmail.com wrote:

Hey Radu,

Thanks for your detailed reply. One follow-up question about the number of
primary shards, since it can't be changed afterwards:
what's the maximum number of primary shards, and is there any way that we
can determine how many primary shards we need?

If you're going to have a static number of indices, you need to think about
how many nodes you're going to spread a single set of those indices across
(not counting replicas) in the long run, without having to reindex your data.
But you also have to take into account that each shard comes with an
overhead, so you can't just go with 1000 shards "just to be sure".
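
One possible reading of this advice, as a sketch: pick the smallest number of primary shards per index that still lets a single copy of the data (replicas not counted) reach every node you plan to have. The helper `primaries_per_index` and its inputs are assumptions for illustration, not an official sizing rule, and the result ignores data volume and per-shard overhead.

```python
import math


def primaries_per_index(planned_nodes, num_indices):
    """Smallest per-index primary count that lets one copy of the data
    (replicas not counted) spread over `planned_nodes` nodes."""
    if planned_nodes < 1 or num_indices < 1:
        raise ValueError("both arguments must be at least 1")
    return math.ceil(planned_nodes / num_indices)


# Hypothetical plans: 10 nodes with 1 index, then 10 nodes with 4 indices.
print(primaries_per_index(10, 1))
print(primaries_per_index(10, 4))
```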

Here's a very good video in which you can see some solutions on how you can
organize your indices and shards:

Best regards,
Radu

http://sematext.com/ -- Elasticsearch -- Solr -- Lucene
