We are in the process of moving one of our datasets from an on-premise cluster to a self-managed cluster on AWS. We are planning on setting up some dedicated master nodes (something we have not done in the past) and we are considering allocating some dedicated query nodes as well. My question is twofold: 1) Are dedicated query nodes worth it? 2) Should I direct my indexing (i.e. Logstash) to the query nodes, or set up a separate load balancer to access the data nodes?
It depends. Mostly on the load you'll have.
A load balancer is not needed and I'd prefer using a coordinating node instead.
Did you also consider cloud.elastic.co? It runs on AWS (or GCP) and is fully managed by Elastic. It has X-Pack, automatic backups, ...
We are currently indexing around 22K doc/sec. Since we are using Logstash to transform the data, it does not appear that setting up ingest nodes would help. It sounds like setting up one or more coordinating nodes would be our best option.
I have not performed an in-depth eval of Elastic Cloud, but it appears that this would be an expensive option for us. We are indexing (with 1 replica) around 2TB of data per day.
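For a rough sense of scale, the two figures above can be combined into an average document size. This is only a back-of-the-envelope sketch: it assumes the 22K docs/sec rate is sustained around the clock, that the 2TB/day figure already includes the replica, and that 1TB means 10^12 bytes. None of these is confirmed in the thread, and the function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def bytes_per_doc(docs_per_sec: int, bytes_per_day: int, copies: int) -> float:
    """Average stored bytes per document for a single copy of the data."""
    docs_per_day = docs_per_sec * SECONDS_PER_DAY
    return bytes_per_day / (docs_per_day * copies)


# Figures from the post; the interpretation is an assumption (see above).
docs_per_sec = 22_000
bytes_per_day = 2 * 10**12   # 2TB/day, assumed to include the replica
copies = 2                   # 1 primary + 1 replica

docs_per_day = docs_per_sec * SECONDS_PER_DAY
avg_bytes = bytes_per_doc(docs_per_sec, bytes_per_day, copies)

print(f"documents per day: {docs_per_day:,}")
print(f"average bytes per document (primary only): {avg_bytes:.0f}")
```

If the 2TB figure is primary data only, pass `copies=1` and the per-document size doubles.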
It sounds like setting up one or more coordinating nodes would be our best option.
Probably a good choice. If you can afford it, start with:
- 3 master only nodes
- x coordinating nodes
- y data nodes
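The layout above could be sketched as per-node settings like this. It is a minimal illustration, not a recommended configuration: it assumes the `node.roles` setting from Elasticsearch 7.9 and later (older versions use the `node.master`, `node.data` and `node.ingest` booleans instead), the tier names and helper function are hypothetical, and the ingest role is left out because Logstash already does the transformation.

```python
# Hypothetical tier names mapped to the roles each node type would carry.
# An empty role list makes a node coordinating-only.
TIER_ROLES = {
    "master": ["master"],
    "coordinating": [],
    "data": ["data"],
}


def render_node_roles(tier: str) -> str:
    """Return the elasticsearch.yml line for a node of the given tier."""
    roles = TIER_ROLES[tier]
    if not roles:
        return "node.roles: [ ]"
    return f"node.roles: [ {', '.join(roles)} ]"


for tier in TIER_ROLES:
    print(f"# {tier} node")
    print(render_node_roles(tier))
```

Each printed line would go into the `elasticsearch.yml` of the matching node. How many coordinating and data nodes to run (the x and y above) depends on the load and is not something this sketch decides.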
Thanks David! I'll check out your presentation.