I recently launched ELK and I'm receiving about 3,000,000 - 8,000,000 docs
per day (~5 GB).
I'm running on AWS on a small server, and after a week of data collection
the system becomes very slow, mainly when I query data older
than 2 days.
Can you recommend server specs (CPU, memory, and IOPS) and Elasticsearch
settings such as the number of shards?
For Elasticsearch, try m3.xlarge and set ES_HEAP_SIZE to 7 or 8GB. You may
also want to have more than one node in your cluster.
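As a rough illustration of where the 7 or 8GB figure comes from: an m3.xlarge has 15 GB of RAM, and a common rule of thumb (my assumption here, not something stated in this thread) is to give Elasticsearch about half of the machine's memory and leave the rest to the OS file cache, staying below roughly 31 GB on larger machines. The helper name below is hypothetical.

```python
def suggested_heap_gb(ram_gb: float, cap_gb: float = 31.0) -> int:
    """Return a heap size in whole GB: half of RAM, rounded down, capped.

    Both the "half of RAM" split and the cap are rules of thumb
    assumed for this sketch, not hard limits.
    """
    return int(min(ram_gb / 2, cap_gb))


M3_XLARGE_RAM_GB = 15  # memory of an m3.xlarge instance

heap = suggested_heap_gb(M3_XLARGE_RAM_GB)
# Value you would export before starting Elasticsearch.
print(f"ES_HEAP_SIZE={heap}g")
```

Treat the result as a starting point and adjust it after watching heap usage and garbage collection on your own data.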
You might also want to split Logstash off onto a separate instance. It is
CPU intensive but not particularly RAM intensive. Set the -w {n} flag in
the startup script so that Logstash runs multiple worker threads across
multiple cores. You might start with an m3.large for this, use -w 2, and
see how it goes.
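If you would rather derive the worker count than hard-code it, something like the sketch below could generate the flag for the startup script. Matching the count to the number of cores is my assumption (an m3.large has 2 vCPUs, which lines up with -w 2), and the function name is hypothetical; only the -w flag itself comes from the advice above.

```python
import os
from typing import List, Optional


def worker_flag(workers: Optional[int] = None) -> List[str]:
    """Build the "-w N" argument pair for the Logstash startup script.

    When no count is given, fall back to the number of CPUs the OS
    reports (assumption: one worker per core is a sane starting point).
    """
    n = workers if workers is not None else (os.cpu_count() or 1)
    return ["-w", str(n)]


# Explicit count, as suggested for an m3.large.
print(" ".join(worker_flag(2)))
# Count derived from the machine this runs on.
print(" ".join(worker_flag()))
```

Start low and raise the count only if the Logstash instance still has idle CPU.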
On Wednesday, August 13, 2014 9:38:10 AM UTC-6, AK wrote:
> I recently launched ELK and I'm receiving about 3,000,000 - 8,000,000 docs
> per day (~ 5GB) [...]
It's a little hard to make a recommendation like this because it really
depends on how you've structured your logical and physical index, how much
historical data you want to keep and query, what sort of queries you run,
how fast you need things to be, etc.
Something like SPM for Elasticsearch (Sematext's monitoring service) can
tell you where your bottleneck is: maybe it's CPU, maybe it's RAM, maybe
it's IO, or something else. Based on that info you will see which
instance types you should get, how many you'll need, etc.
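To make the "how much historical data you want to keep" part concrete, here is a back-of-the-envelope sketch using the numbers from the original question (about 5 GB and 3 to 8 million docs per day). It assumes the on-disk index size is roughly the raw daily volume and ignores replicas and indexing overhead, so the real footprint may differ; the function name is hypothetical.

```python
def retained_volume(days: int,
                    gb_per_day: float = 5.0,
                    docs_per_day: tuple = (3_000_000, 8_000_000)) -> dict:
    """Estimate how much data sits in the cluster after `days` of retention.

    Assumes constant daily volume, no replicas, and no index overhead.
    """
    low, high = docs_per_day
    return {
        "gb": days * gb_per_day,
        "docs_low": days * low,
        "docs_high": days * high,
    }


# One week of data, the point at which the slowdown was reported.
week = retained_volume(7)
print(week)
```

Comparing that estimate against the RAM on the box gives a first hint of whether queries over older data can be served from the file cache or have to hit disk.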