Hi all,
I was tuning my logstash instance working on
- pipeline-workers,
- pipeline-batch-size,
- elasticsearch output workers.
I run logstash several times over the days, every time killing -9 it when I collected enough performance metrics. However, the last time I tried to run it after changing elasticsearch output workers from 5 to 6 and Logstash got hung.
Actually it is running, it is populating logstash.log, but it is not processing events. Even if I kill it and try to start again, or rollback to the previous working configuration, will not solve the issue.
To better understand the issue I raised the log to debug level and captured an strace. However I am not able to fully understand what is happening, can you help me?
Thanks a lot,
Daniele
logstash logging keeps repeating the following lines:
{:timestamp=>"2016-05-25T10:57:20.155000+0200", :message=>"Flushing buffer at interval", :instance=>"#<LogStash::Outputs::ElasticSearch::Buffer:0x6129873 @operations_mutex=#<Mutex:0x8d566c1>, @max_size=500, @operations_lock=#<Java::JavaUtilConcurrentLocks::ReentrantLock:0xceaf303>, @submit_proc=#<Proc:0x5baba4b4@/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-2.6.2-java/lib/logstash/outputs/elasticsearch/common.rb:57>, @logger=#<Cabin::Channel:0x786b226e @metrics=#<Cabin::Metrics:0x23c2b48b @metrics_lock=#<Mutex:0x29c793b5>, @metrics={}, @channel=#<Cabin::Channel:0x786b226e ...>>, @subscriber_lock=#<Mutex:0x2f4f1c3d>, @level=:debug, @subscribers={12528=>#<Cabin::Subscriber:0x38c6b84d @output=#<Cabin::Outputs::IO:0x2d89cedb @io=#<File:/var/log/logstash/logstash-debug-goretti.log>, @lock=#<Mutex:0x7a7081a3>>, @options={}>}, @data={}>, @last_flush=2016-05-25 10:57:19 +0200, @flush_interval=1, @stopping=#<Concurrent::AtomicBoolean:0x45466476>, @buffer=[], @flush_thread=#<Thread:0x48cfb843 run>>", :interval=>1, :level=>:debug, :file=>"logstash/outputs/elasticsearch/buffer.rb", :line=>"90", :method=>"interval_flush"}
{:timestamp=>"2016-05-25T10:57:20.681000+0200", :message=>"Pushing flush onto pipeline", :level=>:debug, :file=>"logstash/pipeline.rb", :line=>"458", :method=>"flush"}
{:timestamp=>"2016-05-25T10:57:20.684000+0200", :message=>"Flushing", :plugin=><LogStash::Filters::Metrics meter=>["events"], add_tag=>["metric"], flush_interval=>10, periodic_flush=>false, ignore_older_than=>0, clear_interval=>-1, rates=>[1, 5, 15], percentiles=>[1, 5, 10, 90, 95, 99, 100]>, :level=>:debug, :file=>"(eval)", :line=>"28", :method=>"initialize"}
{:timestamp=>"2016-05-25T10:57:21.160000+0200", :message=>"Flushing buffer at interval", :instance=>"#<LogStash::Outputs::ElasticSearch::Buffer:0x6129873 @operations_mutex=#<Mutex:0x8d566c1>, @max_size=500, @operations_lock=#<Java::JavaUtilConcurrentLocks::ReentrantLock:0xceaf303>, @submit_proc=#<Proc:0x5baba4b4@/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-2.6.2-java/lib/logstash/outputs/elasticsearch/common.rb:57>, @logger=#<Cabin::Channel:0x786b226e @metrics=#<Cabin::Metrics:0x23c2b48b @metrics_lock=#<Mutex:0x29c793b5>, @metrics={}, @channel=#<Cabin::Channel:0x786b226e ...>>, @subscriber_lock=#<Mutex:0x2f4f1c3d>, @level=:debug, @subscribers={12528=>#<Cabin::Subscriber:0x38c6b84d @output=#<Cabin::Outputs::IO:0x2d89cedb @io=#<File:/var/log/logstash/logstash-debug-goretti.log>, @lock=#<Mutex:0x7a7081a3>>, @options={}>}, @data={}>, @last_flush=2016-05-25 10:57:20 +0200, @flush_interval=1, @stopping=#<Concurrent::AtomicBoolean:0x45466476>, @buffer=[], @flush_thread=#<Thread:0x48cfb843 run>>", :interval=>1, :level=>:debug, :file=>"logstash/outputs/elasticsearch/buffer.rb", :line=>"90", :method=>"interval_flush"}
strace -f keeps repeating the following lines
25041 futex(0x7f105002ca54, FUTEX_WAIT_BITSET_PRIVATE, 1, {1041214, 386230534}, ffffffff <unfinished ...>
25044 ioctl(5, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 0x7f10b19d8e90) = -1 ENOTTY (Inappropriate ioctl for device)
25044 write(5, "{:timestamp=>\"2016-05-25T10:57:1"..., 287) = 287
25044 stat("/home/user/ls-provaperformance/standard-query.log", {st_mode=S_IFREG|0644, st_size=3912294291, ...}) = 0
25044 futex(0x7f1050035154, FUTEX_WAIT_BITSET_PRIVATE, 1, {1041214, 386940726}, ffffffff <unfinished ...>
25070 <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
25070 futex(0x7f105025fc28, FUTEX_WAKE_PRIVATE, 1) = 0
25070 futex(0x7f105025fc54, FUTEX_WAIT_BITSET_PRIVATE, 1, {1041213, 496346755}, ffffffff <unfinished ...>
25036 <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
25036 futex(0x7f112c1e9728, FUTEX_WAKE_PRIVATE, 1) = 0
25036 futex(0x7f112c1e9754, FUTEX_WAIT_BITSET_PRIVATE, 1, {1041213, 472205457}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
25036 futex(0x7f112c1e9728, FUTEX_WAKE_PRIVATE, 1) = 0
25036 futex(0x7f112c1e9754, FUTEX_WAIT_BITSET_PRIVATE, 1, {1041213, 522501456}, ffffffff <unfinished ...>