I am new to Elasticsearch and we are planning to use it to capture all of our order flow for analytics purposes. I anticipate a data rate of about 2.1 to 3 MB/second, with a total of 5 to 7 GB/day of data.
I want to purchase a dedicated pair of machines to run Elasticsearch on, to keep it away from our core application servers, but I was asked whether it would be possible to run the primary Elasticsearch instance on one of our existing core application servers as a cost-saving measure.
I don't quite know how to determine whether this is a good idea or not.
The existing core app servers have 192 GB of memory and 32 cores, with only about 11 GB of memory used and a maximum CPU utilization of 12%.
It seems to me that there is room here, but I am not sure whether it's a good idea.
First, by my math 2 MB/second does not add up to 7 GB of data per day, but to about 7 GB per hour - I suppose that's a peak load?
From a system architecture perspective it makes a lot of sense to keep Elasticsearch and your application separated, so that they do not step on each other's toes under higher load. It's also easier to debug a performance issue when only one service is running per machine (e.g. page cache thrashing will be tough to track down otherwise). Utilization of that machine does sound rather low, however, so testing this on the same machine sounds like a better use of resources than getting a new one.
That said, if you don't want to buy your own hardware, what about Elastic Cloud, so that you just consume the Elastic Stack as a service? See Elastic Cloud: Hosted Elasticsearch, Hosted Search | Elastic
My math was definitely wrong.
5 GB/day, where 1 day is 6.5 hours =
12.8 MB/min =
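For anyone checking the numbers in this thread, here is a minimal sketch of the unit conversions. It assumes decimal units (1 GB = 1000 MB), which is what the 12.8 MB/min figure implies, and the function names are made up for illustration.

```python
MB_PER_GB = 1000  # assumption: decimal units, consistent with 12.8 MB/min


def volume_per_hour_gb(rate_mb_per_s):
    """Data volume in GB produced in one hour at a sustained rate in MB/s."""
    return rate_mb_per_s * 3600 / MB_PER_GB


def average_rate(volume_gb, active_hours):
    """Average rate as (MB/min, MB/s) for a daily volume spread over active_hours."""
    total_mb = volume_gb * MB_PER_GB
    per_min = total_mb / (active_hours * 60)
    per_sec = per_min / 60
    return per_min, per_sec


# A sustained 2 MB/s is about 7 GB per hour, not per day.
print(volume_per_hour_gb(2))  # 7.2

# 5 GB spread over a 6.5-hour day.
per_min, per_sec = average_rate(5, 6.5)
print(round(per_min, 1), round(per_sec, 2))  # 12.8 0.21
```

Under the same assumptions, the upper estimate of 7 GB/day works out to roughly 0.3 MB/second on average, so the original 2.1 to 3 MB/second figures were about ten times too high as a daily average.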
Paying for cloud services isn't going to be popular due to the confidential nature of the data.
My solution is going to be to use our disaster recovery servers, which are sitting around unused in another data center waiting for a disaster to happen. I have 3 high-capacity servers sitting idle which can be used to form a cluster of 3 nodes instead of only 2.
In case of a disaster we would just shut down the Elasticsearch cluster and fail over. After we fail back it would be business as usual. We can miss a few days (or weeks) of data without worry.
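As a rough illustration of why 3 nodes is a nicer number than 2, here is a small sketch. It assumes every node is master-eligible and that the cluster needs a simple majority of nodes to elect a master. That is a simplification, not a description of Elasticsearch's exact voting behavior, and the function names are made up for illustration.

```python
def majority(nodes):
    """Smallest number of nodes that forms a strict majority."""
    return nodes // 2 + 1


def tolerated_failures(nodes):
    """How many nodes can be lost while a majority is still available."""
    return nodes - majority(nodes)


for n in (2, 3):
    print(n, "nodes: majority =", majority(n), ", tolerated failures =", tolerated_failures(n))
```

Under that assumption a 2-node cluster cannot lose either node and still have a majority, while a 3-node cluster can lose any one of them.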
Thank you. I agree with you that the Elasticsearch engine should run on independent servers for the reasons you have given.