Here's my simplified use case: three indexes, each with 5 billion documents. Searches never need to hit more than one index. Indexing time is the concern here. With three server nodes, I can have:
(1) a single Elasticsearch cluster with 3 nodes, indexing in three threads
(2) three Elasticsearch clusters, each with its own single server node.
My experiment shows that (1) is a lot slower than (2) at indexing all the data, about 50% slower. I'd like to hear about others' experience with this -- do you see similar performance in (1), or is there maybe something (configuration) I didn't get right?
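A side note on the configuration question: one difference worth ruling out is the replica count. Elasticsearch does not place a replica on the same node as its primary, so a single-node cluster writes each document once even if `number_of_replicas` is 1, while a 3-node cluster writes it to the primary and to every assigned replica. The sketch below is only back-of-the-envelope arithmetic; the function name is made up for illustration, and the replica count of 1 is an assumption, since the thread does not say which index settings were used.

```python
def shard_copy_writes(num_docs, number_of_replicas, data_nodes):
    """Estimate shard-level document writes for one index (hypothetical helper).

    Copies of the same shard are never allocated to the same node, so the
    number of replicas that actually get assigned is capped at data_nodes - 1.
    """
    assigned_replicas = min(number_of_replicas, data_nodes - 1)
    return num_docs * (1 + assigned_replicas)


if __name__ == "__main__":
    docs_per_index = 5_000_000_000  # from the original post
    assumed_replicas = 1            # ASSUMPTION: not stated in the thread

    # Setup (1): one 3-node cluster holding all three indexes.
    clustered = 3 * shard_copy_writes(docs_per_index, assumed_replicas, data_nodes=3)

    # Setup (2): three single-node clusters, one index each.
    standalone = 3 * shard_copy_writes(docs_per_index, assumed_replicas, data_nodes=1)

    print("setup (1) writes:", clustered)
    print("setup (2) writes:", standalone)
    print("ratio:", clustered / standalone)
```

If the indexes in setup (1) do have replicas, setting `number_of_replicas` to 0 for the duration of the load would make the two setups do the same amount of write work and give a fairer comparison.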
Is the hardware the same in the two scenarios? Is the only difference that in one case you have 3 clustered nodes and in the other you have 3 single-node clusters?
What type of hardware are you deploying on?
Yes, the hardware is all the same: EC2 servers (16 vCPUs, 128 GB). Thanks.
Are you using one instance per host or do you host all 3 nodes on a single host? Which instance type are you using?
This sounds very low. How are you indexing into Elasticsearch? What bulk size are you using? What is the average size of your documents?
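To answer the bulk size and document size questions with numbers, something like the following can help. It is a minimal sketch using only the Python standard library: it measures the average serialized document size and groups documents into `_bulk` request bodies (newline-delimited JSON, one action line followed by one source line) under a byte limit. The helper names, the index name `my-index`, the sample documents, and the 5 MB limit are all placeholders for illustration, not values from this thread, and actually sending the bodies to Elasticsearch is left out.

```python
import json


def bulk_entry(index_name, doc):
    """Return the action line and source line for one index operation."""
    action = json.dumps({"index": {"_index": index_name}})
    source = json.dumps(doc)
    return action + "\n" + source + "\n"


def chunk_bulk_bodies(index_name, docs, max_bytes):
    """Yield (body, doc_count) pairs, starting a new body before max_bytes is exceeded.

    A single document larger than max_bytes still gets a body of its own.
    """
    entries, size, count = [], 0, 0
    for doc in docs:
        entry = bulk_entry(index_name, doc)
        entry_size = len(entry.encode("utf-8"))
        if entries and size + entry_size > max_bytes:
            yield "".join(entries), count
            entries, size, count = [], 0, 0
        entries.append(entry)
        size += entry_size
        count += 1
    if entries:
        yield "".join(entries), count


def average_doc_bytes(docs):
    """Average size in bytes of the JSON-serialized documents."""
    sizes = [len(json.dumps(doc).encode("utf-8")) for doc in docs]
    return sum(sizes) / len(sizes) if sizes else 0.0


if __name__ == "__main__":
    # Placeholder sample; in practice, take a few thousand real documents.
    sample = [{"id": i, "message": "example payload " * 8} for i in range(1000)]
    print("average document size (bytes):", average_doc_bytes(sample))
    for body, count in chunk_bulk_bodies("my-index", sample, max_bytes=5 * 1024 * 1024):
        print("bulk body:", count, "docs,", len(body.encode("utf-8")), "bytes")
```

Reporting the average document size together with the number of documents and bytes per bulk request makes it much easier for others to judge whether the indexing rate is reasonable for the hardware.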