It depends on the amount of data and what it looks like, plus the queries,
filters or facets. And also on how often indexing and search operations are
performed.
Indexing is usually very CPU- and IO-bound, but doing lots of faceting and
filtering will eat memory. Facets usually load fields and/or IDs into memory,
and filters are mostly cached. Then there are the field caches: the more data
you have, the more memory you'll need. As you can see, most of the memory
usage is there for performance reasons, and that's usually a good thing. Most
of it is configurable too, especially the caches.
If I had to choose between memory, CPU and IO performance, I'd take a subset
of the production data on a small test cluster and do some performance
testing (again, with the load I'd expect in production), while monitoring the
ES cluster. Then I should have an idea of which resources are needed more,
but not before tuning the ES configuration to fit my needs.
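To make the "which resource is needed more" step concrete, here is a minimal Python sketch. It assumes you have already collected utilization samples (in percent) from whatever monitoring you use during the test run. The function name, the metric names (`cpu`, `heap`, `disk`), the 80% threshold and the sample numbers are all made up for illustration; nothing here comes from ES itself.

```python
from statistics import mean

def busiest_resource(samples, threshold=80.0):
    """Return (name, average, saturated) for the most heavily used resource.

    samples: list of dicts mapping a metric name to utilization in percent.
    threshold: hypothetical cut-off above which a resource counts as saturated.
    """
    averages = {name: mean(s[name] for s in samples) for name in samples[0]}
    name = max(averages, key=averages.get)
    return name, averages[name], averages[name] >= threshold

# Invented numbers, standing in for measurements taken during a load test.
samples = [
    {"cpu": 85.0, "heap": 40.0, "disk": 30.0},
    {"cpu": 91.0, "heap": 45.0, "disk": 35.0},
]
name, avg, saturated = busiest_resource(samples)
print(f"{name}: {avg:.1f}% average, saturated={saturated}")
```

Averages hide spikes, so treat this as a first pass and rerun it after each round of config tuning.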
http://sematext.com/ -- ElasticSearch -- Solr -- Lucene
On Tue, Dec 4, 2012 at 5:51 PM, Daniel Weitzenfeld wrote:
I'm curious about the CPU vs memory thing, because I was surprised when I
saw this high CPU, low MEM usage pattern for ES:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM
26320 elastics  20   0 1565m 439m 3632 S 89.4 11.7
This is on an m1.medium.
Also, this is while I'm bombarding the thing with queries in order to warm
it up, maybe that's why...?
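As a quick sanity check on the top line above, RES divided by %MEM gives the total RAM the box reports. A small Python sketch, with helper names invented for illustration and assuming the default top column order shown (RES is the 6th field, %MEM the 10th):

```python
def parse_size_mib(text):
    """Convert a top-style size ('439m', '1.5g', or bare KiB like '3632') to MiB."""
    units = {"k": 1 / 1024, "m": 1.0, "g": 1024.0}
    suffix = text[-1].lower()
    if suffix in units:
        return float(text[:-1]) * units[suffix]
    return float(text) / 1024  # top prints bare numbers in KiB

def implied_total_mib(top_line):
    """Estimate total RAM from one top process line as RES / (%MEM / 100)."""
    fields = top_line.split()
    res_mib = parse_size_mib(fields[5])   # RES column
    mem_pct = float(fields[9])            # %MEM column
    return res_mib / (mem_pct / 100.0)

line = "26320 elastics 20 0 1565m 439m 3632 S 89.4 11.7"
total = implied_total_mib(line)
print(f"implied total RAM: {total:.0f} MiB")
```

That comes out at roughly 3.7 GiB, which fits the 3.75 GiB of an m1.medium, so the process really is holding only about 439 MiB resident while the CPU is nearly pegged.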
On Tuesday, December 4, 2012 6:25:58 AM UTC-5, Karel Minařík wrote:
If you allow me to be so free: would a 14 node cluster with just 8GB of
RAM be interesting (m1.large)? Or even a 28 node cluster with 4GB
Why would you prefer that to fewer, stronger boxes? Financially, it's all
the same in AWS: you pay for resources used, not for instances per se.