Elasticsearch JVM options

Unfortunately, bulk indexing didn't give any improvement - I'm still at
~4000 logs per second.
I've also tried playing with indices.memory.index_buffer_size - it doesn't
seem to matter at all.

I'll try to tune the merge policy as well and let you know the results.
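For reference, the tiered merge policy (the default in the 0.90-era releases this thread is about) can be tuned per index. The values below are illustrative starting points, not recommendations - they need to be tested against your own workload:

```
# Sketch of index-level tiered merge policy settings (elasticsearch.yml
# or index settings API); all values here are examples to experiment with.
index.merge.policy.segments_per_tier: 10
index.merge.policy.max_merge_at_once: 10
index.merge.policy.max_merged_segment: 5gb
```

Raising segments_per_tier reduces merge work at the cost of more segments to search, which is usually the right trade-off for write-heavy logging clusters.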

As for the SPM monitoring, is there any open source equivalent?

On Friday, June 14, 2013 at 16:04:29 UTC+4, Radu Gheorghe wrote:

Hi,

On Fri, Jun 14, 2013 at 10:51 AM, kay kay <kay....@gmail.com> wrote:

Thanks for replies!

I've increased the Logstash workers to 10, set shards to 7, updated the JDK
from 6 to 7u21, and now I get ~4000 logs per second.

  • Your indices.memory.index_buffer_size looks high, did you make sure
    it makes indexing faster than the default setting (10%)?

I've tuned it following this article
(ElasticSearch and Logstash Tuning – Vaidas Jablonskis), which says:

ES by default assumes that you're going to use it mostly for searching
and querying, so it allocates 90% of its total heap memory for
searching, but my case was the opposite - the goal is to index
vast amounts of logs as quickly as possible, so I changed that to 50/50.
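The setting the article refers to is node-level, set in elasticsearch.yml. A minimal fragment of what such a change looks like (the 50% value is the one under discussion, not a recommendation - the default is 10%):

```
# elasticsearch.yml -- node-level; governs how much heap is shared
# by active shards for the indexing buffer. Default is 10%.
indices.memory.index_buffer_size: 50%
```

As noted above, it's worth benchmarking against the default before keeping a value this high.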

  • If you can afford SSD disks, it can definitely help,

Disk write speed is no more than 10 MB/sec.

Oh, and you don't mention if you are using bulk indexing. You should!

How and where should I enable it? I've found the flush_size option in the
Logstash elasticsearch_http module, but that module is still in beta.

Yes, elasticsearch_http might give you better results, as you can tune the
flush_size.

It is beta, but it's been there for a long time and is used by many. I guess
it's up to you to test whether it works well for your use case and report
bugs if not.
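A sketch of what the Logstash output section might look like with elasticsearch_http and a tuned flush_size (host and the value 1000 are placeholders to experiment with, not recommendations):

```
# Sketch of a Logstash config output section; elasticsearch_http sends
# events over the HTTP bulk API instead of joining the cluster as a node.
output {
  elasticsearch_http {
    host       => "localhost"
    flush_size => 1000        # events buffered per bulk request
  }
}
```

Larger flush_size values mean fewer, bigger bulk requests; the sweet spot depends on document size and has to be found by testing.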

Talking about testing: I think the key to tuning your performance is
changing settings, trying again to see if performance differs, and doing
all that while monitoring your cluster. Hence Adrien's question on whether
you're sure the 50% index buffer size isn't too much. As Otis suggested,
you also need to know what your bottleneck is - you can check our SPM
(http://sematext.com/spm/elasticsearch-performance-monitoring/) for
monitoring. It's probably either CPU or I/O.

I assume increasing your number of shards should help (because it implies
more segments - you might also try tuning your merge policy:
http://www.elasticsearch.org/guide/reference/index-modules/merge/).
Also, try increasing the refresh_interval from the default 1 second. But
using bulks would probably give you the biggest gain.
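To make the bulk suggestion concrete: Elasticsearch's _bulk endpoint takes a newline-delimited JSON body in which each document is preceded by an action line naming the target index and type. A minimal Python sketch of building such a body (the index name, type, and event fields are made up for illustration; Logstash's elasticsearch_http output does this for you):

```python
import json

# Hypothetical log events; field names are illustrative only.
events = [
    {"@timestamp": "2013-06-14T10:51:00Z", "message": "user login", "host": "web1"},
    {"@timestamp": "2013-06-14T10:51:01Z", "message": "user logout", "host": "web2"},
]

def build_bulk_body(events, index="logstash-2013.06.14", doc_type="logs"):
    """Build the newline-delimited JSON body for the /_bulk endpoint:
    one action line plus one source line per event, with a trailing
    newline (required by the bulk API)."""
    lines = []
    for event in events:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(event))
    return "\n".join(lines) + "\n"

body = build_bulk_body(events)
print(body)
```

The resulting body would be POSTed to http://host:9200/_bulk; sending hundreds or thousands of documents per request amortizes the HTTP and parsing overhead that dominates when indexing one document at a time.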

Best regards,
Radu

http://sematext.com/ -- Elasticsearch -- Solr -- Lucene

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.