Will having multiple nodes in a cluster speed up indexing?

Kiran_Madabhushi · July 27, 2013, 12:49am

Hi,
I am using Elasticsearch and Logstash to manage logs from a very big
system. Right now, during the test phase, I am using 1 machine. It indexes
about 2000 log messages per second. If I have 10 million log messages, it
takes a few hours for the indexing to complete. (I don't know whether it's the
indexing that's taking a long time or Logstash filtering the messages.)
If I create a cluster of ES machines, will this speed up indexing? I am
not really concerned about replicas. Can I configure ES nodes to do just
the indexing part and not worry about replicas? Please suggest any
techniques.

Thanks
Kiran

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

radu_gheorghe · July 30, 2013, 4:16pm

Hello Kiran,

Yes, you can change the number of replicas on the fly using the Update
Settings API:

So you can set the number of replicas to 0 and have only your primary
shards balanced across your cluster. In this case, adding more nodes will
help your indexing speed, as long as you have enough shards to spread across
all your nodes.
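
As a rough sketch of the Update Settings call described above, the Python below builds (but does not send) a PUT request to `/<index>/_settings` that sets `index.number_of_replicas` to 0. The host `http://localhost:9200` and the index name `logstash-2013.07.30` are assumptions for illustration only, and the `Content-Type` header is included because recent Elasticsearch versions expect it.

```python
import json
import urllib.request


def replicas_request(base_url, index, replicas):
    """Build a PUT /<index>/_settings request that changes the replica count."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host and index name, used only for illustration.
req = replicas_request("http://localhost:9200", "logstash-2013.07.30", 0)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To apply the change against a real cluster: urllib.request.urlopen(req)
```

Once the bulk load is finished, the same request with a higher replica count should restore redundancy.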

As Logstash uses daily indices by default, you'll probably want to make
sure that shards of today's index (which are hit with indexing requests)
are evenly distributed. A simple way of doing this is with the
index.routing.allocation.total_shards_per_node setting:

For example, if you have 10 shards per index (no replicas) and 5 nodes, set
that number to 2.
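
The arithmetic behind that example can be sketched as follows. The helper names are hypothetical; the only real identifier is the `index.routing.allocation.total_shards_per_node` setting, and the resulting dictionary is the kind of body you would send through the Update Settings API.

```python
import json
import math


def shards_per_node_limit(primary_shards, replicas, nodes):
    """Smallest per-node cap that still lets every shard copy be allocated."""
    total_copies = primary_shards * (1 + replicas)
    return math.ceil(total_copies / nodes)


def allocation_settings(limit):
    """Settings body that caps how many shards of one index a node may hold."""
    return {"index.routing.allocation.total_shards_per_node": limit}


# The case from the text: 10 shards per index, no replicas, 5 nodes.
limit = shards_per_node_limit(primary_shards=10, replicas=0, nodes=5)
print(json.dumps(allocation_settings(limit)))
```

Keep in mind that this is a hard cap: with no headroom, some shards may stay unassigned if a node drops out, so a slightly higher value can be a safer choice.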

There are quite a lot of tricks to get your indexing speed up, although
there's almost always a trade-off. Here are the top 3 (IMO):

1. Use the bulk API (http://www.elasticsearch.org/guide/reference/api/bulk/).
   The trade-off is that you'll have to use the elasticsearch_http output
   (http://logstash.net/docs/1.1.13/outputs/elasticsearch_http) for that, at
   the moment.
2. Increase the refresh interval. Here's a blog post about it:
   http://blog.sematext.com/2013/07/08/elasticsearch-refresh-interval-vs-indexing-performance/
   The trade-off is that your searches will be less up-to-date.
3. Increase the indexing buffer size
   (http://www.elasticsearch.org/guide/reference/modules/indices/).
   The trade-off is you'll have less memory for stuff like searches.
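
If you end up feeding Elasticsearch from your own script rather than from Logstash, the bulk and refresh-interval tips could look roughly like this. It is only a sketch: the function names, the index name `logs-test`, and the `30s` interval are assumptions for illustration. The bulk body is newline-delimited JSON (one action line followed by one document line, with a trailing newline) to be POSTed to `/_bulk`, and the optional `doc_type` argument is there because older Elasticsearch versions expect a `_type` in the action line.

```python
import json


def bulk_body(index, docs, doc_type=None):
    """Serialize documents into the newline-delimited body of a _bulk request."""
    lines = []
    for doc in docs:
        meta = {"_index": index}
        if doc_type is not None:
            meta["_type"] = doc_type  # only for versions that still use types
        lines.append(json.dumps({"index": meta}))  # action line
        lines.append(json.dumps(doc))              # document source line
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


def refresh_interval_settings(interval):
    """Settings body for the Update Settings API; e.g. "30s", or "-1" to disable."""
    return {"index": {"refresh_interval": interval}}


# Hypothetical log events and index name.
events = [{"message": "service started"}, {"message": "service stopped"}]
print(bulk_body("logs-test", events), end="")
print(json.dumps(refresh_interval_settings("30s")))
```

The indexing buffer is different in that it is a node-level setting (`indices.memory.index_buffer_size` in the node configuration), so it is not something you would change per request.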

You may want to monitor your cluster to check what works and what
doesn't, and where your bottlenecks are. If you're looking for a monitoring
tool for Elasticsearch, check out our SPM:

Best regards,
Radu

http://sematext.com/ -- Elasticsearch -- Solr -- Lucene

