Moving to production considerations - applications with their clusters' infrastructure

rondelvictor · February 12, 2015, 2:27pm

Hi everyone,

I am considering moving one or several Elasticsearch clusters to production.
Although Elasticsearch's documentation and community are great, I am
quite surprised not to find any complete use-case story stretching from
application needs and data considerations to hardware ones.
I understand why "what/how much hardware / configuration /
sharding" questions are systematically answered with "it depends"
followed by "test".
But then, with so many Elasticsearch users, what about a few complete
descriptions, from data use case to cluster internals, along with
a few performance figures and node stats?

So here are my questions before moving to production:

Are there any complete use cases around? Could you share some? By
complete I mean including at least some of the following:

Application needs and scope
Indexing data: data volume, document mappings,
number and size of documents and indexes
Searching data: the different applications, queries, use
of facets / filters / aggregations, concurrent indexing
Cluster hardware: machine specifications (RAM, disks/SSD,
DAS-JBOD/SAN/NAS), JVM heap / OS cache, number of machines, back-office network
Cluster configuration: one or several indexes, sharding,
replication, master nodes, data nodes, use of over-sharding at start-up,
use of re-indexing
Benchmarks: query response times, QPS, with or without concurrent
indexing, memory heap sweet spot, node stats
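To make the checklist above easier to fill in and compare, here is a minimal Python sketch of a report template. The class name `UseCaseReport`, every field name, and all the values in `example` are my own hypothetical placeholders, not measurements from any real cluster; the OS cache figure is only a rough "RAM minus heap" estimate.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class UseCaseReport:
    """Hypothetical template covering the checklist items above."""
    application_scope: str
    # Indexing data
    data_volume_gb: float
    document_count: int
    index_count: int
    # Searching data
    query_types: List[str]
    concurrent_indexing: bool
    # Cluster hardware
    node_count: int
    ram_gb_per_node: int
    heap_gb_per_node: int
    storage: str
    # Cluster configuration
    shards_per_index: int
    replicas: int
    # Benchmarks
    median_query_ms: float
    queries_per_second: float

    def os_cache_gb_per_node(self) -> int:
        # Rough estimate only: RAM not given to the JVM heap is assumed
        # to be available to the OS file-system cache.
        return self.ram_gb_per_node - self.heap_gb_per_node

    def to_json(self) -> str:
        # Serialize the report so it can be pasted into a post or a wiki.
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Placeholder values for illustration only, not real benchmark results.
example = UseCaseReport(
    application_scope="product search (placeholder)",
    data_volume_gb=100.0,
    document_count=1_000_000,
    index_count=1,
    query_types=["filters", "aggregations"],
    concurrent_indexing=True,
    node_count=3,
    ram_gb_per_node=64,
    heap_gb_per_node=30,
    storage="SSD, DAS-JBOD",
    shards_per_index=5,
    replicas=1,
    median_query_ms=0.0,
    queries_per_second=0.0,
)

print(example.to_json())
```

Posting reports in one shared shape like this would make it easier to compare clusters side by side, whatever the actual field list ends up being.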

For those interested, here are the best (though incomplete) of the very few
examples I've stumbled upon so far:

The very best (performance figures with hardware and query descriptions):
http://fr.slideshare.net/charliejuggler/lucene-solrlondonug-meetup28nov2014-solr-es-performance
Hardware and master nodes heap :
https://groups.google.com/forum/?fromgroups#!searchin/elasticsearch/sizing/elasticsearch/V5BtrCGOqoU/l7x6vqMEx5YJ
6th slide - Hardware and storage with number of documents (though
without index and document storage volumes or RAM consumption):

https://speakerdeck.com/bhaskarvk/scaling-elasticsearch-washington-dc-meetup
A JBOD / SAN storage discussion, in "To Raid or not to Raid":

https://groups.google.com/forum/?fromgroups#!searchin/elasticsearch/hardware/elasticsearch/HSj2fZGdU1Y/4mFCBTCb-JcJ

The usual heap considerations in a real case:

https://codeascraft.com/2014/12/04/juggling-multiple-elasticsearch-instances-on-a-single-host/

Do not forget Elasticsearch's excellent docs on moving to
production:

http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/administration.html

http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/deploy.html

http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/hardware.html

http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/heap-sizing.html

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/927f60b1-8ae2-463e-b725-5f4f993905d9%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Charlie_Hull · February 13, 2015, 3:06pm

Hi,

Firstly, thanks for the kind words about the performance study: we're
hoping to revisit this soon after the feedback we've had on better tuning
for each engine. I agree there's a paucity of studies, but in a month or
two we should have one from a project we're working on to build an index of
product data - we hope to present this at the London Meetup group but will
publish the slides afterwards. Sorry this is in the future!

Cheers

Charlie

rondelvictor · February 13, 2015, 6:02pm

Hi Charlie,

That's excellent news! Thank you for your SlideShare deck and the related GitHub repository!

Regards,

Victor
