2 GB of RAM means probably only 1 GB of heap, which is really small for production usage IMHO.
On AWS we often recommend starting with m1.xlarge to avoid noisy neighbors. It comes with 15 GB of RAM by default, so about 7 GB of heap.
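For what it's worth, the usual rule of thumb behind those numbers (roughly half the physical RAM for the heap, and staying below ~31 GB so the JVM keeps using compressed object pointers; the cap is a general JVM guideline I'm adding here, not something from this thread) looks like this as a quick Python sketch:

# Sketch of the common heap-sizing rule of thumb, not an official formula:
# give Elasticsearch about half the RAM, and stay below ~31 GB of heap.
def suggested_heap_gb(ram_gb):
    return min(ram_gb / 2.0, 31.0)

for ram in (2, 4, 15, 64):
    print(f"{ram} GB RAM -> ~{suggested_heap_gb(ram):.1f} GB heap")
# 2 GB RAM  -> ~1.0 GB heap  (the case above: very tight for production)
# 15 GB RAM -> ~7.5 GB heap  (roughly the m1.xlarge case)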
--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr
On 16 December 2013 at 16:46:48, Chris Sham (cbs1918@gmail.com) wrote:
Thank you for your feedback. I understand your concerns with my 'constraints'. I am trying to assess what resources are necessary to meet the product owner's goal. Currently in our development environment, when we start a new server we are given a virtual machine with 20 GB of disk space and 2 GB of RAM. Asking for a machine with 4 GB of RAM versus 64 GB is a much easier task, but I want to make sure that I am asking for an appropriate amount of resources to provide an environment where I can get fair results in testing. Is testing how much one machine can handle with 2 GB of RAM sufficient, or is there a recommended minimum I should start with to get a fair test?
On Monday, December 16, 2013 10:31:57 AM UTC-5, David Pilato wrote:
It's really hard to tell and it seems that you have many constraints here:
- budget "Our IT team felt this was very high and cost prohibitive"
- history "Our product owner wants to be able to search logs for up to 3 years"
I feel like you will not be able to satisfy all these constraints.
I would start by testing with a machine your IT team agrees to give you: inject as many logs as you can into a single index with a single shard on that instance and measure how long queries take.
At some point, you won't be able to satisfy your product owner's requirements anymore. That point is basically the number of documents a single shard can hold.
Then add a new index on the same machine, add as many documents again, and see how search performs. If it is OK, add another index, and so on…
That gives you the number of shards a single machine can hold given the RAM and CPU you have.
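For illustration, that test could be scripted with the official Python client (elasticsearch-py); the index name, document shape and counts below are placeholders, and exact parameter names differ between Elasticsearch/client versions (older versions also require a document type, for example):

from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

es = Elasticsearch("http://localhost:9200")

# One index, one primary shard, no replica: measure what a single shard can take.
es.indices.create(
    index="logs-test-1",
    body={"settings": {"number_of_shards": 1, "number_of_replicas": 0}},
)

def make_batch(start, size=10000):
    # Placeholder log documents; real Logstash events would be richer.
    return (
        {"_index": "logs-test-1",
         "_source": {"message": f"sample log line {i}", "host": "server-01"}}
        for i in range(start, start + size)
    )

for batch in range(100):                       # ~1M docs in this sketch
    bulk(es, make_batch(batch * 10000))
    resp = es.search(
        index="logs-test-1",
        body={"query": {"match": {"message": "sample"}}},
    )
    print("docs:", (batch + 1) * 10000, "search took(ms):", resp["took"])

When the query times (the "took" field) stop meeting the product owner's requirements, that document count is roughly what one shard can hold on that hardware.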
Do you want replicas? If so, you will need at least two machines, because you will have twice as many shards to manage.
Let's say you find you can only hold 2 weeks of data. What now? Are you going to ask for a bigger budget, or are you going to relax the requirements?
5 GB per day is about 5.5 TB of data over 3 years. Call it roughly 11 TB with 1 replica.
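Spelled out (raw figures only; this ignores indexing overhead and compression, which will change the real number):

# Back-of-the-envelope storage estimate from the figures above.
daily_gb = 5                                    # ~5 GB of logs per day
years = 3
replicas = 1

primary_tb = daily_gb * 365 * years / 1000.0    # ~5.5 TB of primary data
total_tb = primary_tb * (1 + replicas)          # ~11 TB with one replica
print(f"primaries ~{primary_tb:.1f} TB, with {replicas} replica ~{total_tb:.1f} TB")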
You can start with less memory than 64 GB per machine. In that case, I think you will need more nodes.
Sorry I'm not able to give you an actual number, but I really think you need to test your different scenarios. The cool thing about the cloud is that you can scale out really easily and test exactly that.
My 0.05 cents.
--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr
On 16 December 2013 at 16:12:14, Chris Sham (cbs...@gmail.com) wrote:
What are the system requirements for running Elasticsearch? I've had a bit of trouble tracking down a specific number. In our dev/beta environments we start all machines on a standard VM build on CentOS 6.4 with 2 GB of RAM and 20 GB of disk space. We have installed Logstash to ship logs to Elasticsearch. We had a couple of issues at one point in development where Elasticsearch was maxing out our CPU and disk space was disappearing quickly. In looking into this, we wanted to check whether we had allocated appropriate resources to the machine in the first place, and, planning for the future, we wanted some idea of what we should be running Elasticsearch on.
As I have tried to gather information from various web sources, I keep hearing that Elasticsearch should run with 64 GB of RAM per machine and that we should have a minimum of 3 machines. Our IT team felt this was very high and cost prohibitive, especially once we start deploying in the cloud. I wanted to follow up and check whether this is really the case. If it is, I wanted to understand how long those resources would last before we needed more RAM/machines.
I understand it's all about the numbers. We are still in development, so I do not have hard numbers, but I can provide close estimates based on the data we are currently seeing. When we go live we anticipate collecting logs at a daily volume of 5-6 GB from about 50 different servers. At this time our product owner wants to be able to search logs for up to 3 years. We are still evaluating whether there is a solution we can use to offload some of the data in increments and provide a secondary path to it if needed. So ideally we would have 3 years of searchable logs. Ultimately I am trying to determine what resources we need to make Elasticsearch effective and whether we can accommodate these requirements, and from there determine the volume and history of logs we would be able to store and search, to see if this solution is going to work for our team.