What do you think, guys?
Could you explain the difference between elasticsearch_shards and replicas? I am still confused after reading the documentation.
I hope you can give me the best suggestions from your experience.
Thanks all
With 1 server, I would suggest 1 shard and 0 replicas. The obvious downside of having only 1 server is that if it fails you can lose data.
8 GB out of 16 GB seems about right.
Could you explain the difference between elasticsearch_shards and replicas?
An index is composed of shards, and replicas are copies of those shards. Shards are how a single index can live across multiple nodes (e.g. if you had 5 servers with 5 shards, each server would get 1 shard). Since you only have 1 node, it doesn't make much sense to have more than 1 shard. You also don't want copies for a single node since there is nowhere for the copy to go.
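As a rough illustration of the single-node advice above, here is a minimal Python sketch that builds the settings body for creating an index with 1 shard and 0 replicas. The helper name `index_settings_body` and the index name in the comment are hypothetical; `number_of_shards` and `number_of_replicas` are the setting names used elsewhere in this thread. The sketch only builds the JSON and does not contact a cluster.

```python
import json


def index_settings_body(shards: int = 1, replicas: int = 0) -> dict:
    """Build a create-index request body (hypothetical helper, for illustration)."""
    if shards < 1:
        raise ValueError("an index needs at least one primary shard")
    if replicas < 0:
        raise ValueError("replica count cannot be negative")
    return {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
            }
        }
    }


# Body that could be sent with PUT /my-index (the index name is an assumption).
body = index_settings_body(shards=1, replicas=0)
print(json.dumps(body, indent=2))
```

The defaults mirror the suggestion for one server; on a larger cluster you would pass different values.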
Is it still recommended to plan ahead and set the shard count to something you know will not change? For example, I've set my shard count to 5 / 1 on a single-node Elastic Stack.
You should therefore try to get the correct number of shards at index creation.
This is still true.
Say, for example, you have a typical logging use case with daily indices; then index creation happens once per day. If you have 1 node, then 5 shards are too many for that single node. Further, typical logging use cases have 30-90 days of retention, so it makes more sense to plan for today, and if today needs more scale, add more nodes and re-plan for today.
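To put rough numbers on the daily-index case, here is a small Python sketch that counts how many shard copies accumulate over a retention window. The function name is hypothetical, and the inputs (90 days, 5 shards with 1 replica versus 1 shard with 0 replicas) are taken from the figures mentioned in this thread purely for illustration.

```python
def shard_copies_for_retention(days: int, shards_per_index: int, replicas: int) -> int:
    """Total shard copies (primaries plus replicas) across one index per day."""
    return days * shards_per_index * (1 + replicas)


# 5 shards / 1 replica per daily index, kept for 90 days.
print(shard_copies_for_retention(90, 5, 1))  # 900

# 1 shard / 0 replicas per daily index, kept for 90 days.
print(shard_copies_for_retention(90, 1, 0))  # 90
```

On a single node the replica half of the first figure could not be allocated anyway, which is another reason to keep the counts small there.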
On the other end of the spectrum, maybe you have an index that is created once to hold inventory data. Today that inventory is small, but you expect it to grow over time, and there isn't a concept of daily indices or retention periods for this type of data. In those cases it may make sense to plan for the future. However, the general advice is to use as few shards as needed, and if you get it wrong there is tooling to help re-size.
I would like to say thank you for your feedback.
Now I understand the difference between shards and replicas.
This is my server's condition right now (htop)
Should I change the config to [Shards = 1; Replicas = 0]?
Can I change the memory size settings?
You also don't want copies for a single node since there is nowhere for the copy to go.
Does that mean each node will hold a copy of every index?
For example, say the first node has some indices (Abc1, dEf2, GHi3). Will they all be on the second node, the third node, and so on up to X nodes, where X = replicas?
This is almost right, but in fact it's X+1 rather than X because the primary doesn't count as a replica. So if you set number_of_replicas:2 then the cluster will allocate up to three copies of each shard: one primary and two replicas.
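The X+1 arithmetic can be sketched in a few lines of Python. Both function names are hypothetical. The second function assumes that a node holds at most one copy of any given shard, which is why the thread says "up to" three copies and why a replica has nowhere to go on a single node.

```python
def copies_per_shard(number_of_replicas: int) -> int:
    """One primary plus the configured number of replicas."""
    return number_of_replicas + 1


def assignable_copies(number_of_replicas: int, nodes: int) -> int:
    """Copies that can actually be placed, assuming one copy of a shard per node."""
    return min(copies_per_shard(number_of_replicas), nodes)


print(copies_per_shard(2))      # 3
print(assignable_copies(2, 3))  # 3
print(assignable_copies(2, 1))  # 1
```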
They are both properties of each index (i.e. different indices can have different values for these settings). You can specify them at index creation time, or in an index template, and you can change the number of replicas later through the API.
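For changing the number of replicas later, a minimal Python sketch of the request might look like the following. It assumes the update goes to the index's `_settings` endpoint with a PUT; the base URL, the index name `my-index`, and the helper name are all assumptions for illustration. The request is only built and printed, not sent, so it runs without a cluster.

```python
import json
import urllib.request


def replica_update_request(base_url: str, index: str, replicas: int) -> urllib.request.Request:
    """Build (but do not send) a request that changes number_of_replicas."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical local node and index name.
req = replica_update_request("http://localhost:9200", "my-index", 0)
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req) would send it; omitted here on purpose.
```

Note that only the replica count is touched here, in line with the point above that replicas can be changed after index creation.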
I don't understand the question. You specify the number_of_shards and number_of_replicas settings when you create an index. It's not something that you set on any one node.