Elasticsearch two nodes config

Hi everyone,
I have a question about high availability.
System Info
-5 different logs
-Data size: ~75 GB per day
-Total servers: 2 nodes (1 master and 1 slave)
-Daily indices
-I use 8 GB for both Xms and Xmx
-Master = RAM 56 GB 800 MHz and CPU 2.61 GHz, 12 cores, 2 sockets
-Slave = RAM 24 GB 1333 MHz and CPU 2.61 GHz, 12 cores, 2 sockets

Problems
-I want to view monthly visual reports over the daily indices, but the master's CPU reaches 99% and the Kibana dashboard returns a visualization timeout.
-Is it possible to achieve high availability using two nodes?

  • Does writing to two different paths cause data loss? (Example: two different destination data path addresses)
    -Do you have any advice?

ELK settings

MASTER

bootstrap.memory_lock: true
cluster.name: elasticsearch
http.port: 9200
node.data: true
node.ingest: false
node.master: true
node.max_local_storage_nodes: 2
node.name: 172.28.25.235
path.data: \\172.28.1.1\Elastic\Elasticsearch\data
path.logs: C:\ProgramData\Elastic\Elasticsearch\logs
transport.tcp.port: 9300
xpack.license.self_generated.type: basic
xpack.security.enabled: false
discovery.zen.minimum_master_nodes: 1
discovery.zen.ping.unicast.hosts: ["172.28.25.235","172.28.26.169"]
network.host: 172.28.25.235

SLAVE

bootstrap.memory_lock: true
cluster.name: elasticsearch
http.port: 9200
node.data: true
node.ingest: true
node.master: false
node.max_local_storage_nodes: 1
node.name: 172.28.26.169
path.data: \\172.28.1.1\Elastic\Elasticsearch\data2
path.logs: C:\ProgramData\Elastic\Elasticsearch\logs
transport.tcp.port: 9300
xpack.license.self_generated.type: basic
xpack.security.enabled: false
discovery.zen.minimum_master_nodes: 1
discovery.zen.ping.unicast.hosts: ["172.28.25.235","172.28.26.169"]
network.host: 172.28.26.169

Logstash

output {
  elasticsearch {
    hosts => ["172.28.25.235:9200","172.28.26.169:9200"]
    index => "%{es_index}"
    manage_template => false
  }
}

// http://172.28.25.235:9200/_cluster/health

{
  "cluster_name": "OriginMaster",
  "status": "yellow",
  "timed_out": false,
  "number_of_nodes": 2,
  "number_of_data_nodes": 2,
  "active_primary_shards": 101,
  "active_shards": 163,
  "relocating_shards": 0,
  "initializing_shards": 2,
  "unassigned_shards": 37,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 80.6930693069307
}

I have also attached the "/_cluster/stats" output as additional information.
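
The health response above can be read with a little arithmetic. The following is only a sketch: it assumes the reported percentage is the number of active shard copies divided by all shard copies (active, initializing, and unassigned), which matches the figures shown here. A "yellow" status means every primary shard is assigned but some replica copies are not.

```python
# Values copied from the _cluster/health response above.
health = {
    "active_shards": 163,
    "initializing_shards": 2,
    "unassigned_shards": 37,
}


def active_percent(h):
    """Share of shard copies that are active, as a percentage.

    Assumption: the total is active + initializing + unassigned copies.
    """
    total = h["active_shards"] + h["initializing_shards"] + h["unassigned_shards"]
    return 100.0 * h["active_shards"] / total


print(round(active_percent(health), 4))  # prints 80.6931
```

The result agrees with the reported "active_shards_percent_as_number", so the missing share corresponds to the 37 unassigned and 2 initializing shard copies.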

Hi @khergner,

Possibly your dashboard is making unreasonably complicated queries. I'm not the best person to help you with this, but I can answer the other questions:

On high availability with two nodes: no, you need at least three master-eligible nodes for a fault-tolerant cluster.

I don't understand the question about writing to two different paths. What is a "destination path data address"?

Hi David,
Thank you for your reply, but I'm confused: how can I use two nodes effectively? As I understand it, if we want high availability we need 6 nodes (3 master nodes and 3 data nodes). Have I got that right?

Destination data address: as shown below
For Master: \\172.28.1.1\Elastic\Elasticsearch\data
For Slave: \\172.28.1.1\Elastic\Elasticsearch\data2

Regards.

No, the absolute minimum is three nodes because each master-eligible node can also act as a data node. You may need more nodes for performance reasons: using 3 dedicated master nodes and some number of dedicated data nodes can be a good idea in larger or more heavily loaded clusters.
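
The three-node minimum follows from majority voting among master-eligible nodes. As a rough sketch (the function names are made up for illustration, and the majority rule floor(n / 2) + 1 is the usual guidance for `discovery.zen.minimum_master_nodes`):

```python
def quorum(master_eligible: int) -> int:
    """Majority of master-eligible nodes: floor(n / 2) + 1."""
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible: int) -> int:
    """Master-eligible nodes that can fail while a majority remains."""
    return master_eligible - quorum(master_eligible)


# Columns: master-eligible nodes, quorum, failures tolerated
for n in (1, 2, 3):
    print(n, quorum(n), tolerated_failures(n))
# prints:
# 1 1 0
# 2 2 0
# 3 2 1
```

With one or two master-eligible nodes no failure of a master-eligible node can be tolerated; three is the smallest count that survives losing one.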

I think I see. It's recommended to give each node its own data path like this. Elasticsearch makes sure not to lose any data when replicating it across different nodes. However, these paths look like they are network-attached storage and not local disks. We recommend using local disks, since network-attached storage often performs poorly and is easy to misconfigure, putting your data at risk of corruption.

Sorry, missed a question:

Using two nodes doesn't give you fault tolerance, but it still allows you to make use of all the resources of those two nodes: disk space, IO bandwidth, RAM, CPU, and so on.

Now I understand. I was researching this topic and found the article below. I have one last question: should I use multiple Logstash services or a single one? I currently use a single Logstash service. Is this approach correct?

Is it necessary to use both nodes as master and data nodes?

I can't comment on your Logstash setup, sorry, but maybe you could ask in the Logstash forum?

I'd normally say to use one master-eligible node, not two. If only one of the two nodes is master-eligible then you can at least tolerate the failure of the other one.
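
To illustrate the trade-off, here is a minimal sketch comparing the two layouts. The node names `node-a` and `node-b` are hypothetical, and the model only checks whether enough master-eligible nodes survive to satisfy `discovery.zen.minimum_master_nodes`; it ignores shard allocation and everything else.

```python
def can_elect_master(master_eligible, failed, minimum_master_nodes):
    """True if enough master-eligible nodes survive to elect a master."""
    survivors = set(master_eligible) - set(failed)
    return len(survivors) >= minimum_master_nodes


# Layout 1: only node-a is master-eligible, minimum_master_nodes = 1.
print(can_elect_master({"node-a"}, {"node-b"}, 1))  # True: losing node-b is tolerated
print(can_elect_master({"node-a"}, {"node-a"}, 1))  # False: losing the only master-eligible node

# Layout 2: both nodes master-eligible, minimum_master_nodes = 2 (the majority).
print(can_elect_master({"node-a", "node-b"}, {"node-a"}, 2))  # False
print(can_elect_master({"node-a", "node-b"}, {"node-b"}, 2))  # False
```

Under these assumptions, the single master-eligible layout keeps a master through one of the two possible failures, while the two master-eligible layout with a majority setting keeps a master through neither.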

Hi David,
Thank you for your reply. Now it's clear to me.
Regards
