Logstash limiting Elasticsearch heap


(Antonio Augusto Santos) #1

Hello,

I think I'm hitting some kind of wall here. I'm running Logstash on a
syslog server. It receives logs from about 150 machines, plus a LOT of
iptables logs, and sends them to Elasticsearch. But I don't think I'm
getting the speed I should. My Logstash throughput tops out at about 1,000
events/s, and my ES servers (I have 2) look lightly loaded.

On Logstash I have three configs (syslog, ossec and iptables), so I get three
new nodes on my cluster. I've set the LS heap size to 2G, but according
to bigdesk the ES module is getting only about 150MB, and it's generating a
LOT of GC.

Below is the bigdesk screenshot:

[image: bigdesk]
https://cloud.githubusercontent.com/assets/6423413/3261580/5bc4faf2-f25b-11e3-8529-df0eee61b1e5.png

And here is the Logstash process I'm running:

# ps -ef | grep logstash
logstash 13371 1 99 14:42 pts/0 00:29:37 /usr/bin/java -Djava.io.tmpdir=/opt/logstash/tmp -Xmx2g -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -Djava.awt.headless=true -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintClassHistogram -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -Xloggc:./logstash-gc.log -jar /opt/logstash/vendor/jar/jruby-complete-1.7.11.jar -I/opt/logstash/lib /opt/logstash/lib/logstash/runner.rb agent -f /etc/logstash/conf.d -l /var/log/logstash/logstash.log

Memory usage on my syslog/LS server seems very light as well (it's a 4-core
VM), but the Logstash process is always sitting at about 150%-200% CPU.

# free -m
             total       used       free     shared    buffers     cached
Mem:          7872       2076       5795          0         39       1502
-/+ buffers/cache:        534       7337
Swap:         1023          8       1015

# uptime
15:02:04 up 23:52, 1 user, load average: 1.39, 1.12, 0.96

Any ideas what I can do to increase the indexing performance?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/b6346f68-1c17-4699-8ad0-cb9121b5c7cb%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Mark Walkom) #2

Lots of GC isn't bad; you want to see a lot of small GCs rather than
the stop-the-world sort, which can bring your cluster down.

You can try increasing the index refresh interval (index.refresh_interval).
If you don't require "live" access, increasing it to 60 seconds or
more will help.

If you can gist/pastebin a bit more info on your cluster (node specs,
versions, total indexes and size, etc.), it may help.
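
The refresh interval change suggested above could be applied through the index settings update endpoint. The sketch below only builds the request and does not send it. The node address (localhost:9200), the index pattern (logstash-*) and the helper name build_refresh_interval_request are assumptions for illustration, not details from this thread, and the exact request requirements may differ between Elasticsearch versions.

```python
import json
import urllib.request


def build_refresh_interval_request(base_url, index_pattern, interval):
    """Build (but do not send) a PUT request that updates index.refresh_interval."""
    body = json.dumps({"index": {"refresh_interval": interval}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index_pattern}/_settings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Assumed values: a node on localhost:9200 and Logstash-style index names.
request = build_refresh_interval_request("http://localhost:9200", "logstash-*", "60s")

# Sending is left commented out so the sketch has no side effects:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Setting the interval back to a lower value later uses the same request with a different interval string.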

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com

On 18 June 2014 22:53, Antonio Augusto Santos mkhaos7@gmail.com wrote:

Any ideas what I can do to increase the indexing performance?



(Antonio Augusto Santos) #3

Thanks for your response, Mark.

I think I've finally fine-tuned my scenario.
For starters, it helped A LOT to set -Xms on Logstash to the same value
as LS_HEAP_SIZE. It really reduced the GC.

Second, I followed some tips from
http://jablonskis.org/2013/elasticsearch-and-logstash-tuning/index.html and
https://blog.codecentric.de/en/2014/05/elasticsearch-indexing-performance-cheatsheet/
to increase my indexing speed (search is secondary here).

After that I increased the number of workers on LS (I had to change
/etc/init.d/logstash, since it was not respecting LS_WORKERS in
/etc/sysconfig/logstash). This made a big difference, and I could finally
see that the workers were my bottleneck (with 3 workers my 4 cores
were hitting 100% usage all the time). So I increased my VM cores to 8, set
LS_WORKERS to 6, and set workers to 3 on the elasticsearch output. The
major boost came from these changes, and I could see that LS is heavily CPU
dependent.

Last, but not least, I changed my log strategy. Instead of saving the logs
to disk with syslog and reading them back with LS, I set up a scenario
like http://cookbook.logstash.net/recipes/central-syslog/ and got myself a
redis server as temporary storage (for these logs I don't need logs on file,
ES will do just fine).
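
The broker arrangement described above (shippers push events into a temporary store and an indexer drains it) can be illustrated with a small sketch. This is not Logstash or Redis code: Python's standard queue.Queue stands in for the Redis list, and the names shipper, indexer and run_pipeline, as well as the batch and buffer sizes, are hypothetical values chosen only for illustration.

```python
import queue
import threading


def shipper(buffer, lines):
    """Push raw log lines into the buffer (the role Redis plays in the setup above)."""
    for line in lines:
        buffer.put(line)  # blocks while the buffer is full, giving back-pressure
    buffer.put(None)      # sentinel: no more events


def indexer(buffer, batch_size, sink):
    """Drain the buffer in batches, as an indexer might before a bulk request."""
    batch = []
    while True:
        item = buffer.get()
        if item is None:
            break
        batch.append(item)
        if len(batch) >= batch_size:
            sink.append(batch)
            batch = []
    if batch:
        sink.append(batch)  # flush the final partial batch


def run_pipeline(lines, batch_size=3, buffer_size=5):
    """Run one shipper and one indexer around a bounded in-memory buffer."""
    buffer = queue.Queue(maxsize=buffer_size)
    sink = []
    consumer = threading.Thread(target=indexer, args=(buffer, batch_size, sink))
    consumer.start()
    shipper(buffer, lines)
    consumer.join()
    return sink


batches = run_pipeline([f"event {i}" for i in range(7)])
print([len(batch) for batch in batches])  # → [3, 3, 1]
```

The point of the buffer is that the receiving side never waits on disk or on the indexer directly; it only waits when the buffer itself is full.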

After that, my indexing speed went from about 500 TPS to about 4k
TPS.

Not bad :wink:

On Thursday, June 19, 2014 5:32:56 AM UTC-3, Mark Walkom wrote:

Lots of GC isn't bad; you want to see a lot of small GCs rather than
the stop-the-world sort, which can bring your cluster down.



(Erica) #4

Antonio,

I have heard many people talking about setting the heap size in
Elasticsearch, but I can't seem to figure out where to do this. I have
tried multiple ways, none of which seem to change the performance or
throughput, so I assume I may have implemented them incorrectly. If
you could point me in the right direction, that would be great. I am using a
Windows machine with 4GB RAM, if that makes any difference.

Erica

On Thursday, June 19, 2014 9:43:36 PM UTC-4, Antonio Augusto Santos wrote:

Thanks for your response, Mark.

I think I've finally fine-tuned my scenario.
For starters, it helped A LOT to set -Xms on Logstash to the same value
as LS_HEAP_SIZE. It really reduced the GC.



(Antonio Augusto Santos) #5

You should just have to pass -Xmx2g to the .bat you are calling.
If this doesn't work you may have to set an environment variable called
ES_JAVA_OPTS, and set the Xmx there.

You can confirm that either of these worked by looking at the resulting
process and checking the value of Xmx.
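
As a rough way to do that check, the sketch below pulls the -Xms and -Xmx values out of a Java command line that you have already copied from your process listing. The function name heap_flags is hypothetical, the sample command line is shortened from the one earlier in this thread, and how you obtain the command line depends on your platform.

```python
import re


def heap_flags(command_line):
    """Return the -Xms/-Xmx values found in a java command line.

    If a flag appears more than once, the last occurrence is kept
    (an assumption here: the JVM generally honours the last one).
    """
    flags = {"Xms": None, "Xmx": None}
    pattern = r"(?<!\S)-(Xms|Xmx)(\d+[kKmMgG]?)(?!\S)"
    for name, value in re.findall(pattern, command_line):
        flags[name] = value
    return flags


# Shortened from the process listing earlier in this thread.
sample = "/usr/bin/java -Djava.io.tmpdir=/opt/logstash/tmp -Xmx2g -XX:+UseParNewGC"
print(heap_flags(sample))  # → {'Xms': None, 'Xmx': '2g'}
```

A value of None for Xms means the flag was not passed, so the JVM starts with its default initial heap and grows towards Xmx.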

Also, it might be worth taking a look at
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-service-win.html

On Thursday, July 3, 2014 4:05:03 PM UTC-3, Erica wrote:

Antonio,

I have heard many people talking about setting the heap size in
Elasticsearch, but I can't seem to figure out where to do this. I have
tried multiple ways, none of which seem to change the performance or
throughput, so I assume I may have implemented them incorrectly. If
you could point me in the right direction, that would be great. I am using a
Windows machine with 4GB RAM, if that makes any difference.

