Scaling to 150k/sec

I currently have a proof of concept cluster handling about 12000 msgs/sec. If a certain project kicks off, I would need to scale 10+ times. Is anyone successfully running ES at over 150k/sec? What kind of shard layout are you using, how many data nodes, etc?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/7e31b4346a994d20939c349fb3c002fd%40BN1PR07MB039.namprd07.prod.outlook.com.
For more options, visit https://groups.google.com/d/optout.

We've gotten a cluster up to 40K/sec using roughly 40 nodes. We're going
to switch over to using dedicated master nodes as well. This is all in AWS.

-Justin

We are getting ~15K/s with 12 data + 3 master nodes, on the latest versions
of Java and ES.

Some things to try would be:

  • Java 8
  • G1GC
  • SSD storage
  • tweaking various index and pool caches
  • optimising your data inputs, mappings etc

We're trialling the first two of those on our dev cluster, but it doesn't
do much traffic, so I can't yet comment empirically on its capabilities at
the levels you're after.

What does your current setup (i.e. infrastructure) look like now?

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com

We haven't tried Java 8 or changing garbage collection yet -- I've heard
mixed results on GC. We're using SSD storage on Azure, and have a fairly
tweaked out config. I was thinking about turning off any kind of analysis
in the template to get it to scale...

There's a bunch of kernel/OS tweaks you can apply as well, e.g. noatime and
nodiratime if you mount the ES dir on its own, or some of these:
http://namhuy.net/1563/how-to-tweak-and-optimize-ssd-for-ubuntu-linux-mint.html
Then indices.store.throttle.max_bytes_per_sec might be worth looking at;
you can increase that from the 20mb default.
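
As a rough illustration of the throttle change mentioned above, here is a minimal Python sketch that only builds the request body for the cluster update settings API (PUT /_cluster/settings). It assumes an ES 1.x-era cluster, where this setting exists and can be changed dynamically. The "100mb" value and the helper name throttle_settings are illustrative assumptions, not a recommendation from this thread.

```python
import json


def throttle_settings(max_bytes_per_sec, persistent=False):
    """Build a cluster-settings body that changes the store throttle.

    "transient" settings are lost on a full cluster restart;
    "persistent" settings survive it.
    """
    scope = "persistent" if persistent else "transient"
    return {scope: {"indices.store.throttle.max_bytes_per_sec": max_bytes_per_sec}}


# "100mb" is an arbitrary example value; measure before and after changing it.
body = throttle_settings("100mb")
print("PUT /_cluster/settings")
print(json.dumps(body, indent=2))
```

The printed body can be sent with curl or any HTTP client. Starting with a transient setting makes it easy to back out if merges start to starve indexing or search I/O.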

I know there'd be a bunch of people that would be interested in your setup
and the tweaks you've done if you're interested in putting up a blog
post/gist with it.

Regards,
Mark Walkom


I got 25K per second using a 3-node, 32G EC2 system.

I am also interested in something like 150k/sec of indexing speed, and need
help.
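
To put the figures reported in this thread side by side, here is a back-of-the-envelope Python sketch. It assumes throughput scales linearly with the number of data nodes, which is optimistic, and it treats the "roughly 40 nodes" report as 40 data nodes. Document sizes and hardware differ between the reports, so the results are only a rough guide.

```python
# (label, docs_per_sec, data_nodes) as reported in this thread
reports = [
    ("~40-node AWS cluster", 40_000, 40),  # assumes all 40 are data nodes
    ("12 data + 3 master nodes", 15_000, 12),
    ("3-node EC2 system", 25_000, 3),
]

TARGET = 150_000  # docs/sec asked about in the original post


def nodes_needed(docs_per_sec, data_nodes, target=TARGET):
    """Data nodes required for `target`, assuming linear scaling.

    Uses integer ceiling division to avoid floating-point rounding.
    """
    return (target * data_nodes + docs_per_sec - 1) // docs_per_sec


for label, rate, nodes in reports:
    per_node = rate / nodes
    print(f"{label}: {per_node:,.0f}/sec per node, "
          f"about {nodes_needed(rate, nodes)} nodes for {TARGET:,}/sec")
```

The spread between the estimates shows how much document size, hardware and tuning matter, which is why per-node throughput on your own data is the number worth measuring first.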

It would be helpful to see the document size along with the parameters.

We have a bunch of different things going into Elasticsearch, mostly
network-related telemetry. The latest painful one is IPFIX (NetFlow). The
aggregation we're working on looks like this:

{
  "_index": "ipfix-2014.03.17",
  "_type": "logs",
  "_id": "oA5uhOc9QpCaH9iGPlY7AQ",
  "_score": null,
  "_source": {
    "peer_ip_src": "207.46.32.122",
    "ip_dst": "157.56.106.160",
    "as_src": 12670,
    "mask_dst": 27,
    "as_path": "",
    "ip_src": "92.102.0.0",
    "bytes": 80,
    "port_dst": 3544,
    "mask_src": 16,
    "stamp_inserted": "2014-03-17 09:18:00",
    "stamp_updated": "2014-03-17 10:31:28",
    "packets": 1,
    "@version": "1",
    "@timestamp": "2014-03-17T12:19:27.244Z",
    "source": "ipfix"
  },
  "sort": [
    1395058767244
  ]
}

There's just a really stupid high volume of really small messages that need
little in the way of analysis (which we need to turn off).
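
One way to turn analysis off for documents like this is an index template with a dynamic mapping rule. The Python sketch below only builds the template body. It assumes ES 1.x-era mapping syntax ("string" fields with "index": "not_analyzed", and a top-level "template" pattern), and the dynamic-template name "strings_not_analyzed" is a hypothetical label. The "ipfix-*" pattern and the "logs" type are taken from the sample document above.

```python
import json


def ipfix_template(pattern="ipfix-*", doc_type="logs"):
    """Index template body that maps every string field as not_analyzed."""
    return {
        "template": pattern,  # applies to the daily ipfix-YYYY.MM.DD indices
        "mappings": {
            doc_type: {
                "dynamic_templates": [
                    {
                        # hypothetical rule name; any label works here
                        "strings_not_analyzed": {
                            "match_mapping_type": "string",
                            "mapping": {
                                "type": "string",
                                "index": "not_analyzed",
                            },
                        }
                    }
                ]
            }
        },
    }


# The template name in the URL ("ipfix") is also just an example.
print("PUT /_template/ipfix")
print(json.dumps(ipfix_template(), indent=2))
```

A template only affects indices created after it is installed, so with daily indices the change takes effect at the next rollover. Newer Elasticsearch versions use different syntax for this, so check the documentation for the version you run.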

Hi,

These are nice and small and require no analysis. Turn off _all, tweak the
merge rate, use a high refresh interval, give ES/Lucene a good indexing
buffer, look at the transaction log (translog) flush settings, etc., and you
should be able to get to 150K/sec without requiring dozens of servers.
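
A minimal Python sketch of how some of these suggestions might look as an index template body. It assumes ES 1.x-era setting names (index.refresh_interval, index.translog.flush_threshold_size and the _all mapping field), and the "30s" and "1gb" values are illustrative guesses rather than tested numbers. The indexing buffer (indices.memory.index_buffer_size) and merge throttling are node- or cluster-level settings, so they are not part of this body.

```python
import json


def bulk_indexing_template(pattern="ipfix-*", doc_type="logs",
                           refresh_interval="30s", translog_flush_size="1gb"):
    """Template body trading search freshness for indexing throughput."""
    return {
        "template": pattern,
        "settings": {
            # Refresh less often than the 1s default to create fewer segments.
            "index.refresh_interval": refresh_interval,
            # Let the translog grow larger before a flush is forced.
            "index.translog.flush_threshold_size": translog_flush_size,
        },
        "mappings": {
            # Skip building the catch-all _all field for every document.
            doc_type: {"_all": {"enabled": False}}
        },
    }


print(json.dumps(bulk_indexing_template(), indent=2))
```

Each knob has a cost: a longer refresh interval delays when new documents become searchable, and a larger translog can lengthen recovery after a crash, so it is worth changing one at a time and measuring.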

Otis

Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

Any update here? I'd be interested in hearing if you've been able to make progress on this problem, as I've been trying to solve a similar issue.