I did some intensive tests last week on a 20-node cluster and had the
following insights - I'd be interested if anyone has similar/dissimilar
experience.
The 20 nodes had 8 cores and 32GB of memory each. I configured
Elasticsearch to use 15GB of that memory.
The sample events I was using were Apache logs (common format) without any
additional fields (no geoip, useragent, etc. plugins).
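For context, the pipeline was essentially just the stock Apache common-log
grok pattern feeding the elasticsearch output - roughly this shape (the
file path and cluster name below are placeholders, not my exact config):

    input {
      file {
        path => "/var/log/apache2/access.log"   # placeholder path
      }
    }

    filter {
      grok {
        # stock pattern for Apache common log format; no geoip/useragent filters
        match => [ "message", "%{COMMONAPACHELOG}" ]
      }
    }

    output {
      elasticsearch {
        # Logstash 1.3.x node-protocol output; cluster name is a placeholder
        cluster => "logstash-bench"
      }
    }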
When running as a 20-node cluster, I got a maximum ingestion rate of 2500k
events/minute (41k/second), but the bottleneck was the Logstash CPU
load... so I reduced to a 10-node cluster...
With the 10 nodes I initially had 1600k/minute (27k/second) and achieved
1800k/minute (30k/second) by increasing index_refresh_interval to 30s and
index_buffer_size to 20%.
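For anyone wanting to reproduce this, those two knobs correspond (as far as
I understand the 1.x settings) to something like the following;
index.refresh_interval can also be changed per index at runtime via the
_settings API:

    # elasticsearch.yml (ES 1.x) - sketch of the two tuning changes
    index.refresh_interval: 30s               # default is 1s
    indices.memory.index_buffer_size: 20%     # default is 10% of the heap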
Further reducing to 5 nodes, I had 1100k/minute (18k/second).
This brings me to an interesting comparison: at 10 nodes I get 3k
events/second/node, and with 5 nodes I get 3.66k events/second/node; i.e.
the per-node throughput cost of doubling from 5 to 10 nodes is about 20%.
Is this to be expected? Just how scalable is Elasticsearch - at what point
is the diminishing return on adding nodes not cost effective?
Is the further logical reduction to 375 events/core/second still meaningful?
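Spelling out the arithmetic behind those per-node and per-core figures:

    10 nodes: 30.0k events/s / 10 nodes = 3.00k events/s/node
     5 nodes: 18.3k events/s /  5 nodes = 3.66k events/s/node
    per core: 3.00k events/s/node / 8 cores ~= 375 events/s/core
    loss from doubling 5 -> 10 nodes: 1 - 3.00/3.66 ~= 18%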
Each node had 8 cores (2.4GHz Xeon), 32GB RAM, and SSD disks (I never saw
IOWait, but I was also focusing on ingestion rate).
I always had 2 dedicated master nodes, and in addition tried configurations
of 20, 10, and 5 data nodes.
Running Elasticsearch 1.0.1 (but with Logstash 1.3.3).
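The master/data split is just the standard ES 1.x node role settings, along
these lines:

    # elasticsearch.yml on the 2 dedicated master nodes
    node.master: true
    node.data: false

    # elasticsearch.yml on the 20/10/5 data nodes
    node.master: false
    node.data: true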
Sorry, correction: we were running ES and LS with Java version "1.7.0_51".
I had just looked up the default Java on that system, which (someone says)
is required for some other applications... but yes, we were using the
correct Java version!
AFAIK there is no 3.2. Debian went from 3.1 "sarge" to 4.0 "etch".
(I have a computer still running that was installed as Debian potato in
2001, taken to Debian testing, and moved to woody when it became stable in
2002. Both woody and sarge were fine releases in their time, but I wouldn't
pick either for a new high-performance cluster...)