Scaling to 900 nodes or more, risks and pitfalls?

Hi,

The company I work for plans to scale our ES cluster from 20 nodes to about
900 or perhaps more because of higher data volume.

Does anyone have experience with clusters of that size? Is it even possible,
and should I expect indexing and search times to stay the same, increase, or
decrease?
If it is possible, are there any pitfalls to avoid?

Best Regards,

Kim

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Judging by what others have posted on the mailing list, I have not heard of
a cluster of that size. Mozilla perhaps, but maybe not even them.

Will all 900 nodes be part of the same cluster? The network chatter might
be large. IMHO, if you can afford 900 nodes, then you can afford to use
Elasticsearch's own professional services. :)

Cheers,

Ivan

Hi,

I've heard of 1000-node SolrCloud indices that worked, so I assume ES may
be OK there, too. I'd expect a good amount of work, careful configuring
and tuning for such a massive system. How much data, what sort of data,
and what sort of query complexity and rate are we talking about here?

Otis

Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

Not directly related to ES cluster size, but still relevant to your goal:
you'll want to invest time in something like Puppet or Chef, coupled with a
module such as the ES Puppet module. You'll find provisioning, extending,
and generally managing your cluster a lot simpler.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com

Yes, but you need to be careful not to let Puppet/Chef restart your nodes without proper coordination. Right now we use Puppet for deployment and configuration, but we manually groom indexes off each node before bouncing it.

Nik Everett
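
One possible reading of "grooming indexes off a node" is to ask the cluster to relocate that node's shards before restarting it. The sketch below only builds the HTTP request for the cluster settings API using the `cluster.routing.allocation.exclude._ip` setting; it does not send anything. The function name, base URL, and IP address are hypothetical, and this is an assumption about the workflow, not a description of the setup mentioned above.

```python
import json
import urllib.request


def build_exclude_request(base_url, node_ip):
    """Build (but do not send) a request that excludes node_ip from shard allocation.

    Once applied, the cluster relocates shards away from that node, after which
    it could be restarted with less disruption. Hypothetical helper for illustration.
    """
    body = {"transient": {"cluster.routing.allocation.exclude._ip": node_ip}}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Example values are assumptions; adjust to your own cluster.
request = build_exclude_request("http://localhost:9200/", "10.0.0.5")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

A deployment tool would need to wait until the node holds no shards before bouncing it, and the exclusion would have to be cleared afterwards so the node can take shards again.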

Thank you all for your feedback. I will take it into consideration.

--

*Kim Laplume* | Software Engineer

kim@talkwalker.com

A product of Trendiction S.A.
14, rue Aldringen | L- 1118 Luxembourg

If you go ahead with this, I know there are a lot of people on the list
(including me) who would be interested in keeping up to date with your
progress.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com

I can't imagine 1000 nodes (multi-petabyte?) of full-text search, so it is
probably for big data.
Why not build a lambda architecture?

Use Elasticsearch for the serving layer (that's what I do) and Hadoop as the
batch layer.

Create a speed layer for real-time data if needed (e.g. a counter in Redis).

--
Laurent Laborde
Bigdata Hacker
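
To make the lambda idea concrete, here is a minimal sketch of how a query could combine the two views: counts precomputed by the batch layer plus counts accumulated by the speed layer since the last batch run. Plain dictionaries stand in for the real stores (for example an Elasticsearch index and Redis counters); the function name and the sample data are hypothetical.

```python
from collections import Counter


def merged_counts(batch_view, speed_view):
    """Combine precomputed batch counts with counts seen since the last batch run."""
    total = Counter(batch_view)
    total.update(speed_view)  # Counter.update adds counts key by key
    return dict(total)


# Sample data is made up for illustration.
batch_view = {"elasticsearch": 1200, "hadoop": 800}
speed_view = {"elasticsearch": 7, "redis": 3}

print(merged_counts(batch_view, speed_view))
# {'elasticsearch': 1207, 'hadoop': 800, 'redis': 3}
```

In this arrangement the speed-layer counters would be reset each time a new batch view is published, so nothing is counted twice.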
