Best config for Performance


#1

Hello,
I have 1 dedicated server for Elasticsearch.
Here are the details:

Elasticsearch

  • 4 cores; 16GB memory

I believe I need to set up both the Graylog2 and the Elasticsearch config.

Graylog2 config (/etc/default/graylog-server)

(/etc/graylog/server/server.conf)

elasticsearch_shards = 5
elasticsearch_replicas = 1

Elasticsearch Config (/etc/elasticsearch/jvm.options)

-Xms8g
-Xmx8g

What do you think, guys?
Could you explain the difference between elasticsearch_shards and replicas? I am still confused after reading the documentation.
I hope you can share your best suggestions from your experience :slight_smile:
Thanks, all


(Jake Landis) #2

With 1 server, I would suggest 1 shard and 0 replicas. The obvious downside of only 1 server is that if it fails you can lose data.

The 8g out of 16g seems about right.

Could you explain the difference between elasticsearch_shards and replicas?

An index is composed of shards, and replicas are copies of those shards. Shards are how a single index can live across multiple nodes (e.g. if you had 5 servers and 5 shards, each server would get 1 shard). Since you only have 1 node, it doesn't make much sense to have more than 1 shard. You also don't want copies on a single node, since there is nowhere for the copy to go.
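A rough way to see this in numbers: Elasticsearch does not place two copies of the same shard on the same node, so a replica with no second node to go to simply stays unassigned. The sketch below is only a counting model under that one rule, not Elasticsearch's real allocator, and the function name `allocation_summary` is made up for illustration.

```python
def allocation_summary(nodes, shards, replicas):
    """Count shard copies that can and cannot be placed.

    Simplified model: copies of the same shard must sit on different
    nodes, so each shard can have at most `nodes` assigned copies.
    """
    copies_per_shard = 1 + replicas  # one primary plus its replicas
    assigned_per_shard = min(copies_per_shard, nodes)
    return {
        "assigned": shards * assigned_per_shard,
        "unassigned": shards * (copies_per_shard - assigned_per_shard),
    }


# Single node with the settings from the first post (5 shards, 1 replica)
print(allocation_summary(nodes=1, shards=5, replicas=1))
# → {'assigned': 5, 'unassigned': 5}

# Single node with 1 shard and 0 replicas
print(allocation_summary(nodes=1, shards=1, replicas=0))
# → {'assigned': 1, 'unassigned': 0}
```

With 5 shards and 1 replica on one node, the 5 replica copies have nowhere to go, which is why the cluster health would show yellow rather than green.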


#3

Is it still recommended to plan ahead and set the shard count to something you know will not change? For example, I've set my shard count to 5 / 1 (shards / replicas) on a single-node Elastic Stack.


(Jake Landis) #4

In that referenced thread Luca says:

You should therefore try to get the correct number of shards at index creation.

This is still true.

Say, for example, you have a typical logging use case with daily indices, so index creation happens once per day. If you have 1 node, then 5 shards are too many for that single node. Further, typical logging use cases have 30-90 day retention, so it makes more sense to plan for today, and if today needs more scale, add more nodes and re-plan for today.
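To put rough numbers on the daily-index case, here is a small arithmetic sketch. It assumes one index per day and reads the 30-90 figure as days of retention; the function name `shards_on_cluster` is hypothetical.

```python
def shards_on_cluster(retention_days, shards_per_index, replicas=0):
    """Total shard copies held at once with one index created per day."""
    return retention_days * shards_per_index * (1 + replicas)


for days in (30, 90):
    five = shards_on_cluster(days, shards_per_index=5)
    one = shards_on_cluster(days, shards_per_index=1)
    print(f"{days} days: {five} shards with 5 per index, {one} with 1 per index")
# → 30 days: 150 shards with 5 per index, 30 with 1 per index
# → 90 days: 450 shards with 5 per index, 90 with 1 per index
```

On a single node, every one of those shards lives on the same machine, so the per-index shard count multiplies quickly across the retention window.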

At the other end of the spectrum, maybe you have an index that is created once to hold inventory data. Today that inventory is small, but you expect it to grow over time, and there is no concept of daily indices or retention periods for this type of data. In those cases it may make sense to plan for the future. However, the general advice is to use as few shards as needed, and if you get it wrong there is tooling to help re-size.


#5

Thank you for your feedback.
Now I understand the difference between shards and replicas.
This is my server's current state (htop):


Should I change the config to [Shards = 1; Replicas = 0]?
Can I change the memory size settings?

-Xms11g
-Xmx11g
That would leave 5 GB of memory.

These are the Graylog stats right now:

What's your opinion, guys?


#6

@jakelandis What do replicas do exactly?

You also don't want copies on a single node, since there is nowhere for the copy to go.

Does that mean the indices are copied onto the other nodes?
For example, say the first node holds some indices (Abc1, dEf2, GHi3). Will they all also be on the second node, the third node, and so on, up to X nodes where X = replicas?


(David Turner) #7

This is almost right, but in fact it's X+1 rather than X because the primary doesn't count as a replica. So if you set number_of_replicas:2 then the cluster will allocate up to three copies of each shard: one primary and two replicas.
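The X+1 counting can be spelled out in a few lines. This is just an illustration of the arithmetic; the helper `shard_copies` and its labels are made up and are not Elasticsearch names.

```python
def shard_copies(number_of_replicas):
    """Return a label for every copy of one shard: a primary plus replicas."""
    return ["primary"] + [f"replica-{i}" for i in range(1, number_of_replicas + 1)]


copies = shard_copies(2)
print(copies)       # → ['primary', 'replica-1', 'replica-2']
print(len(copies))  # → 3
```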


#8

@DavidTurner Are the shards and replicas settings configured on the master node only?
If I have a data node, should I set shards and replicas on the data node too?


(David Turner) #9

They are both properties of each index (i.e. different indices can have different values for these settings). You can specify them at index creation time, or in an index template, and you can change the number of replicas later through the API.
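As a minimal sketch of what "per index" looks like in practice, the snippet below builds (but does not send) two REST requests: one that creates an index with both settings, and one that changes only the replica count on an existing index. The URL `http://localhost:9200` and the index name `example-index` are assumptions for illustration, and the example ignores authentication and TLS.

```python
import json
from urllib.request import Request

ES_URL = "http://localhost:9200"  # assumption: local, unsecured node
INDEX = "example-index"           # hypothetical index name


def create_index_request(shards, replicas):
    """Build (not send) a create-index request carrying both settings."""
    body = {"settings": {"index": {"number_of_shards": shards,
                                   "number_of_replicas": replicas}}}
    return Request(f"{ES_URL}/{INDEX}", data=json.dumps(body).encode(),
                   headers={"Content-Type": "application/json"}, method="PUT")


def update_replicas_request(replicas):
    """Build (not send) a request changing replicas on an existing index."""
    body = {"index": {"number_of_replicas": replicas}}
    return Request(f"{ES_URL}/{INDEX}/_settings", data=json.dumps(body).encode(),
                   headers={"Content-Type": "application/json"}, method="PUT")


req = update_replicas_request(0)
print(req.get_method(), req.full_url, req.data.decode())
# → PUT http://localhost:9200/example-index/_settings {"index": {"number_of_replicas": 0}}
```

Passing either request to `urllib.request.urlopen` would send it. Note that there is deliberately no update helper for `number_of_shards`, matching the point above that only the replica count can be changed later through this API.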


#10

@DavidTurner If I have 5 ES servers (2 masters, 3 data nodes), do I use the same shards and replicas values on all servers?


(David Turner) #11

I don't understand the question. You specify the number_of_shards and number_of_replicas settings when you create an index. It's not something that you set on any one node.

