We have been trying to move from core Lucene to Elasticsearch and have run into a couple of difficulties. Before asking our questions, we would like to share details of our current system, data, and infrastructure, along with our main target:
Our System-Data-Infrastructure:
a. Currently, we are hosting Lucene 3.1 on a single standalone server without replication
b. Data Size: 5 TB minimum to 15 TB maximum, and possibly growing beyond that
c. Total Doc Count: At least 50 million and increasing
d. New Doc Indexing/Updating Time Interval: Daily
e. Full Re-indexing Time Interval: Once a month
f. Planned Hosting Region: US-East (Ohio)
g. Usage Up/Down Time: Up 24 hours a day
h. Traffic Hits: Moderately high
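To make the data-size figures above concrete, here is a rough back-of-the-envelope sizing sketch. It assumes one replica per primary shard and a target shard size of 40 GB; both values, and the helper name `estimate_shards`, are our own assumptions for illustration, not measurements or recommendations from Elastic.

```python
import math


def estimate_shards(data_tb, replicas=1, target_shard_gb=40):
    """Return (primary_shards, total_shards, total_storage_tb).

    data_tb         -- size of the primary data in TB (from item b above)
    replicas        -- assumed number of replica copies per primary shard
    target_shard_gb -- assumed target size of a single shard in GB
    """
    primary_gb = data_tb * 1024
    primary_shards = math.ceil(primary_gb / target_shard_gb)
    total_shards = primary_shards * (1 + replicas)
    total_storage_tb = data_tb * (1 + replicas)
    return primary_shards, total_shards, total_storage_tb


# Estimate for the minimum and maximum data sizes listed above.
for size_tb in (5, 15):
    primaries, total, storage = estimate_shards(size_tb)
    print(f"{size_tb} TB: {primaries} primary shards, "
          f"{total} shards in total, {storage} TB on disk")
```

This ignores index overhead, free-disk headroom, and the fact that an Elasticsearch index may differ in size from the existing Lucene 3.1 index, so the numbers are only a starting point for the sizing questions below.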
Our Main Target:
- Migrate to Elasticsearch with replication
- Integrate our own custom plugin, which works on top of Lucene/Elasticsearch queries
Our Questions relating to our Complex Situations:
- If we move to Elasticsearch, what would be the best possible infrastructure configuration, i.e.:
A. What combination of master nodes and data nodes will be required to hold the above data size efficiently without compromising search speed?
B. What is the best instance type (number of cores and memory size) to use for the master nodes and for the data nodes?
C. Does adding more nodes by itself load-balance Elasticsearch, or is there another way to load-balance it?
- We have found that we can run Elasticsearch through the following vendors:
A. Elasticsearch Service by Elastic Cloud
a. If we use this, can we install our custom plugin, and how? Also, do we “own” the installation, or is it perpetually under Elastic Cloud's control?
b. Do Elastic Cloud solutions include load balancing for Elasticsearch automatically (or is load balancing based solely on the number of nodes)?
c. If we deploy our application on the cloud (e.g. Google Cloud, AWS, or Azure), how can we manage the VPC and its security while connecting to Elasticsearch on Elastic Cloud?
d. If we deploy our application on the cloud (e.g. Google Cloud, AWS, or Azure), won't these cloud services charge for data transfer?
e. Is there a package plan for a one-year or three-year contract?
f. What is the level of Technical Support provided by Elastic Cloud?
B. Elasticsearch Service by Elastic Cloud on the AWS Marketplace
a. If we use it from the Marketplace, can we install our custom plugin, and how? Also, do we “own” the installation, or is it perpetually under Elastic Cloud's control?
b. Do Elastic Cloud solutions include load balancing for Elasticsearch automatically (or is load balancing based solely on the number of nodes)?
c. If we deploy our application on the cloud (e.g. Google Cloud, AWS, or Azure), how can we manage the VPC and its security while connecting to Elasticsearch on Elastic Cloud from the Marketplace?
d. If we deploy our application on the cloud (e.g. Google Cloud, AWS, or Azure), won't these cloud services charge for data transfer?
e. Is there a package plan for a one-year or three-year contract if we choose it from the Marketplace?
f. What level of technical support is provided for Elasticsearch on Elastic Cloud from the Marketplace?
C. AWS Elasticsearch Service
Since this service does not allow custom plugins to be installed, we do not think it suits us.