Elastic Cloud Architecture Selection and System Requirements Suggestion

Hey there,

I just tried the Elastic Cloud service and it looks like it meets my expectations. I'd like to ask a few questions before going further.

I currently have approximately 600GB of data, with every field strictly mapped as "keyword". My data may grow to 1.4TB, possibly more in the future, and it needs to be actively searchable across all indices. There are a few things I can't decide on my own, so I need suggestions about:

1- I liked the "Hot-Warm Architecture" with its high storage offering, but I think I need an architecture with no warm nodes. As I said, I need to actively search all indices at high search speed. Should I choose the "I/O Optimized Architecture" instead?

2- If I need more storage in the future, can I easily increase the size of the nodes or add new nodes without extra configuration? Does Elastic Cloud support distributing data across the old and new nodes, something like automatic reorganizing?

3- I don't really understand "Fault Tolerance". Does it mean creating a new data node used only for replica shards? Or does it mean synchronizing the primary shards on my current data node to another machine in another zone, so that if my current data machine goes down, I'm automatically served from the backup machine in the other zone?


Note: I previously opened a topic in 'Elasticsearch' with the same text. After I found this section, I thought my topic was more appropriate here. I already flagged the old topic as "Content on wrong section" but didn't get any response, so I reposted it here.

This forum focuses on the infrastructure ("ECE") that we use to power Elastic Cloud ("ESS", for Elasticsearch Service), which can also be downloaded and used as a standalone "cloud".

Since the answers apply to ECE as well, I'll respond here (note that general support for ESS is available at https://cloud.elastic.co/help):

1] For general-purpose applications we recommend high-IO (the default with SSD on ECE). Hot-warm is really for logging use cases where you're going to have loads of data that you don't search very often.

2] On both ECE and ESS, to increase capacity you just move the slider up, which will either increase the capacity of the node or add a new node (currently on ESS it keeps increasing the node size until it hits 64GB, then adds new 64GB nodes; on ECE you have more control over node capacity increases vs additional nodes).

The capacity increase is non-disruptive (provided you have fault tolerance enabled, see below), and data will automatically rebalance across any new nodes (in fact it just uses the configurable ES settings for rebalancing).
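To make the ESS sizing rule in 2] concrete, here is a rough Python sketch of it. This is only an illustration of the behaviour described above, not how the service is actually implemented: the function name `plan_nodes` is made up, and rounding up to whole 64GB nodes is an assumption for sizes the slider may not even offer.

```python
import math

# Per-node ceiling quoted in the reply for ESS at the time of writing.
MAX_NODE_GB = 64


def plan_nodes(requested_gb, max_node_gb=MAX_NODE_GB):
    """Return the per-node sizes (GB) for one zone.

    Hypothetical helper: grow a single node up to max_node_gb,
    then switch to as many max_node_gb nodes as needed.
    """
    if requested_gb <= 0:
        raise ValueError("requested_gb must be positive")
    if requested_gb <= max_node_gb:
        # Still fits in one node, so the node itself gets bigger.
        return [requested_gb]
    # Past the ceiling: add full-size nodes (rounding up is an assumption).
    return [max_node_gb] * math.ceil(requested_gb / max_node_gb)


for size in (16, 64, 128, 192):
    print(size, "->", plan_nodes(size))
```

So a 128GB selection would come out as two 64GB nodes per zone under this sketch, and Elasticsearch would then rebalance shards across them.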

3] Fault-tolerance means: for a given size you select on the slider in the UI (eg 16GB, 64GB etc), you get that amount of capacity in either 2 zones or 3 zones (depending on which you pick).

So for example, if you select 32GB capacity and 2 zones, then you will get one 32GB instance in zone1 (eg us-east-1a), another 32GB instance in zone2 (and then a small 1GB "no data tiebreaker" in zone3)
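The same example can be written out as a small Python sketch. The names (`zone_layout`, `zone1`, etc.) are placeholders for illustration, and treating the 3-zone case as three data instances with no tiebreaker is my assumption, since only the 2-zone layout is spelled out above.

```python
def zone_layout(capacity_gb, zones):
    """Return the instances created for a given slider size and zone count.

    Hypothetical helper mirroring the example in the text.
    """
    if zones not in (2, 3):
        raise ValueError("this sketch only covers 2 or 3 zones")
    # One data instance of the selected size in each chosen zone.
    layout = [
        {"zone": f"zone{i}", "role": "data", "size_gb": capacity_gb}
        for i in range(1, zones + 1)
    ]
    if zones == 2:
        # With two data zones, a small instance holding no data
        # sits in a third zone as the tiebreaker.
        layout.append({"zone": "zone3", "role": "tiebreaker", "size_gb": 1})
    return layout


for instance in zone_layout(32, 2):
    print(instance)
```

Note that the selected size is per zone: 32GB with 2 zones means 64GB of data capacity in total, which is what lets a full copy of the data live in each zone.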

We then use the ES feature "zone awareness" to ensure that (under normal operation) each zone gets one copy of each shard (this assumes you configure the replicas per index correctly, eg using the default of 1 for 2 zones, or setting it to 2 for 3 zones).
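As a rough sketch of the replica guidance above (1 replica for two zones, 2 for three), the Python below builds the body for Elasticsearch's update index settings API (`PUT /<index>/_settings` with `index.number_of_replicas`). The index name `my-index` and the helper names are placeholders, and the snippet only prints the request rather than sending it.

```python
import json

# Replica counts taken from the reply: one copy of each shard per zone.
REPLICAS_BY_ZONES = {2: 1, 3: 2}


def replica_settings_body(zones):
    """Build the JSON body for PUT /<index>/_settings (hypothetical helper)."""
    if zones not in REPLICAS_BY_ZONES:
        raise ValueError("this sketch only covers 2 or 3 zones")
    return json.dumps({"index": {"number_of_replicas": REPLICAS_BY_ZONES[zones]}})


# "my-index" is a placeholder index name.
print("PUT /my-index/_settings")
print(replica_settings_body(3))
```

With 2 zones the default of 1 replica already gives one copy per zone, so a change like this should only be needed when moving to 3 zones.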

This means that if one zone goes down (or more usually one of the servers inside the zone), the data will remain available while our infrastructure automatically brings up a new server and moves your dead instance to that (where it then reloads its data via Elasticsearch replication).
