ElasticSearch setup

luizgpsantos · March 6, 2012, 2:27am

Hi guys,

We are working on a project to evaluate moving our search solution from
FAST to Elasticsearch. We are looking at several possible configurations
and have some questions about how to proceed with our setup.

The first is how to calculate the number of shards and replicas. Is there
any rule to follow based on the size of the index and the available memory
and disk space?

Another point we are studying is whether there is any way to separate the
indexing servers from the search servers. It seems that we could achieve
this by putting the primary shards on one group of servers, so that the
search servers would hold only the replicas. Does that make sense? If so,
how can we configure it?

We will have a set of indexes, separated by product. Do we have to
guarantee the allocation of data in the same shard using cluster routing
allocation, or is there a better way to do that?

Our cluster will consist of 6 servers, each with a 6-core Intel L5640,
64 GB RAM and six 300 GB SAS-2 disks in RAID 5. Can we achieve good
performance with this configuration? What is the best way to use these
servers with Elasticsearch?

--
Luiz Guilherme P. Santos

Karussell1 · March 6, 2012, 8:21am

The first is how to calculate the number of shards and replicas. Is there
any rule to follow based on the size of the index and the available memory
and disk space?

How many machines and indices are you using for FAST? What is your index
size, what are your query requirements, etc.?

Another point we are studying is whether there is any way to separate the
indexing servers from the search servers.

Why do you want to do this?

Peter.

haarts · March 6, 2012, 10:25am

(Disclaimer: I consider myself only slightly above an ES n00b.)
What performance are you looking for? To me, the servers you have
should handle almost anything you throw at them.

There is no way to calculate how many documents a shard can handle.
The general advice is to load a bunch of data into a one-shard setup and
monitor disk and memory. BigDesk (lukas-vlcek/bigdesk on GitHub) is an
easy way of monitoring, as is elasticsearch-head
(http://mobz.github.com/elasticsearch-head/).
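
A minimal sketch of how that one-shard test could be set up, in Python. It only builds the HTTP requests and does not send them. The index name `shard_test` and the helper names are made up for illustration; the `number_of_shards` and `number_of_replicas` settings and the `_stats` endpoint are the standard Elasticsearch ones.

```python
import json


def single_shard_test_index(index_name):
    """Return (method, path, body) for creating a test index
    with exactly one primary shard and no replicas."""
    body = {
        "settings": {
            "index": {
                "number_of_shards": 1,    # fixed once the index is created
                "number_of_replicas": 0,  # can be raised later on a live index
            }
        }
    }
    return "PUT", "/" + index_name, json.dumps(body)


def index_stats_request(index_name):
    """Return (method, path, body) for reading the index statistics
    (document count, store size) while the shard is being loaded."""
    return "GET", "/" + index_name + "/_stats", None


# "shard_test" is a hypothetical index name.
method, path, body = single_shard_test_index("shard_test")
print(method, path, body)
print(*index_stats_request("shard_test")[:2])
```

Send these with whatever HTTP client you like while loading data, and watch where indexing and query latency start to degrade.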

I don't see a point in splitting the search and index servers, unless you
really like doing ops.

I'm not sure about the multi-index data allocation question.

Hope this helps.

kimchy · March 6, 2012, 7:58pm

You can't really split search and index servers in Elasticsearch. Since you want (near) real-time search, indexing happens on the replica shards as well as on the primary shards (and which copy is the primary can change as machines come and go). See the Elasticsearch site for more info.
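
To illustrate why holding only replicas does not spare a node from indexing work, here is a toy model in Python. It is not Elasticsearch code: the node names, shard placement and document counts are invented, and the only assumption is the one stated above, that every document indexed into a shard is indexed on each copy of that shard.

```python
def indexing_ops_per_node(placement, docs_per_shard):
    """Count indexing operations each node performs.

    placement:      {node_name: [(shard_id, role), ...]}
    docs_per_shard: {shard_id: number_of_documents_indexed}

    The role ("primary" or "replica") is deliberately ignored:
    both kinds of copy index every document routed to the shard.
    """
    ops = {}
    for node, copies in placement.items():
        ops[node] = sum(docs_per_shard[shard_id] for shard_id, _role in copies)
    return ops


# Hypothetical layout: one "indexing" node with all primaries,
# one "search" node with all replicas.
placement = {
    "node_a": [(0, "primary"), (1, "primary")],
    "node_b": [(0, "replica"), (1, "replica")],
}
docs_per_shard = {0: 1000, 1: 1500}

print(indexing_ops_per_node(placement, docs_per_shard))
```

Both nodes end up with the same indexing load, so the "search only" node is not really search only.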

Regarding the number of shards, that really depends on how your data flows and how much data you index. Because document types, index mappings and the like vary so widely, it's kind of hard to answer. The suggested method of using a single shard and hammering it is a good one.
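
Once the single-shard test gives you a number, turning it into a shard count is simple arithmetic. A rough sketch in Python; the function name, the 20% headroom and the document counts are all assumptions for illustration, not recommendations.

```python
import math


def shards_needed(expected_docs, docs_per_shard, headroom=0.2):
    """Estimate the number of primary shards.

    expected_docs:  documents you expect the index to hold
    docs_per_shard: documents one shard handled acceptably in your own test
    headroom:       fraction of measured capacity kept in reserve (assumed)
    """
    usable = docs_per_shard * (1.0 - headroom)
    return max(1, math.ceil(expected_docs / usable))


# Hypothetical figures: 50M documents expected, and one shard
# stayed fast up to about 10M documents in the test.
print(shards_needed(50_000_000, 10_000_000))
```

The result is only as good as the test behind it, so the test data and queries should look like production.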

Your setup sounds good, but without knowing a bit more about the data and the type of searches executed, it's hard to give recommendations.
