I am planning to deploy a cluster with 2 nodes. Below is the proposed architecture; I would like a community review.
Purpose: Non-enterprise. A final-year project that will be collecting data from internet-facing sensors (honeypots) for the next 6 months.
Primary risk consideration: Downtime. Any downtime would break the chain of collection.
Proposed architecture: 2 Node cluster.
As a student I have limited compute resources. I have a single workstation that will serve both nodes (I understand the downside of having a single piece of underlying hardware; however, given the situation I have to accept the risk). I have a NAS which backs up the VMs every 12 hours to minimise the recovery point objective.
The Logstash pipeline points at the IP of the primary node. What steps should I take when I have to bring the primary node offline? How do I manage auto-switching of IPs?
Whenever I have to take a node offline, what are the precautions I have to take? (I may need to take the VM offline for security patching or tuning underlying OS for my project.)
Is giving the non-master node lesser hardware OK? My primary need is ingestion and assimilation, not multiple queries per second or minute. I will be the lone user of the system, and I will be querying a large amount of data twice or thrice a week at most.
Configure Logstash to point to both nodes, instead of one.
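The elasticsearch output in Logstash accepts a list of hosts and load-balances across them, failing over if one is unreachable. A minimal sketch (the hostnames and port are placeholders for your two nodes):

```
output {
  elasticsearch {
    # list both nodes; Logstash will skip an unreachable host
    hosts => ["http://node-1:9200", "http://node-2:9200"]
  }
}
```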
You only have 2 nodes, so you should enable node.master on both nodes. If one node is down, the other node will be able to take over.
You may follow the rolling upgrade doc; essentially, set the cluster to only allow primary shard allocation during maintenance.
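As a sketch, the setting in question is cluster.routing.allocation.enable, toggled before and after the maintenance window:

```
# before taking the node down: allocate primaries only
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.enable": "primaries"
  }
}

# after the node rejoins: restore normal allocation
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.enable": null
  }
}
```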
Should be fine for testing purposes. It will depend on the amount of data you expect to query/store in Elasticsearch. For reference, I am running a fairly small 3-node cluster with 4 vCores and 8 GB RAM per node.
Not that I know of. Elasticsearch is a distributed system. We don't usually do that in a distributed system, which defeats the purpose imo.
Actually the recommended minimum viable cluster is 3 nodes, not 2, with the default minimum quorum of 2.
There is no primary/secondary node concept in Elasticsearch (as in no. 1 above). As long as your remaining node can handle the load/data volume, for testing purposes I wouldn't think too much about it.
Are the following configurations OK? The reason I am spending time finalising them is that I cannot take the system offline: log collection is real-time and I cannot lose telemetry data.
Setting discovery hosts:
discovery.seed_hosts: ["host1", "host2"]
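For context, a fuller elasticsearch.yml sketch for such a two-node 7.x setup might look as follows (the cluster name and node names are assumptions, not from your post):

```yaml
# elasticsearch.yml (7.x) — both nodes master-eligible and data-holding
cluster.name: honeypot-cluster          # assumed name
node.name: node-1                       # node-2 on the other VM
node.master: true
node.data: true
discovery.seed_hosts: ["host1", "host2"]
cluster.initial_master_nodes: ["node-1", "node-2"]   # consulted only on first bootstrap
```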
Setting master nodes: do I need to change the order before taking node-1 offline?
Before disconnecting a node, I plan to take the respective node out of shard allocation using:
My question here is: is there an impact if I upgrade one node to the next update of the ELK stack (7.8.1 to whatever comes next)? Or do I need to stop all ingestion and upgrade them together?
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.exclude._ip": "ip of node-1"
  }
}
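Assuming the intent is to drain shards off a node by IP before shutting it down, the matching setting is cluster.routing.allocation.exclude._ip (allocation.enable takes values like "all" or "primaries", not an IP). Once the node is back, the exclusion can be cleared by setting it to null:

```
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.exclude._ip": null
  }
}
```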
I was reading this documentation page on resilience in small clusters: https://www.elastic.co/guide/en/elasticsearch/reference/current/high-availability-cluster-small-clusters.html
A node with node.voting_only: true and other roles such as data and master turned off: is this doable? And can this node be provisioned without the same storage (including IOPS) requirements as the primary and secondary nodes? I want to make a 3-node cluster wherein two will store data and provide HA while the third one is only a tiebreaker.
For upgrades, please refer to this rolling upgrades doc.
You will have to upgrade one node at a time if you want to keep your cluster online. See the above doc.
You don't need to change any configuration to bring down a node.
Yup, certainly, the node.voting_only: true config is how you set up a tie-breaker node. You can set up a very light ES node as the tie-breaker, because it only votes to fulfil the quorum and can never be elected master itself.
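A sketch of the tiebreaker node's settings under 7.x, where voting_only still requires the node to be master-eligible (the node name is an assumption):

```yaml
# elasticsearch.yml — dedicated voting-only tiebreaker (7.x)
node.name: tiebreaker        # assumed name
node.master: true            # must stay true: only master-eligible nodes can vote
node.voting_only: true       # votes in elections, but can never become master
node.data: false             # holds no shards, so modest disk/IOPS is fine
node.ingest: false
```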