Elastic Stack - get the last message that fails in Logstash

I'm indexing custom logs into Logstash.
With one file I'm getting timeouts, and I can't determine which line is causing them.
As far as I can see, Filebeat is shipping them in batches.

Is there any option to get this single failing message out of Logstash?
I didn't get any further with the stdout output and the json/rubydebug codecs applied.
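
One option that might help narrow it down (a sketch only, assuming the stock file output plugin and json_lines codec that ship with Logstash; the path is just an example): mirror every event to a local file alongside the elasticsearch output, so the last line written before the pipeline stalls is a good candidate for the event that breaks the batch.

output {
	elasticsearch {
		hosts => [ "localhost:9200" ]
	}
	# Debug mirror: one JSON line per event; the last line written before
	# the stall should point at the problematic message.
	file {
		path => "e:/logstash/debug/last-events.log"
		codec => json_lines
	}
}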

My error, if it's any additional help:

2017-09-18T14:16:01+02:00 ERR Failed to publish events caused by: read tcp [::1]:53202->[::1]:5044: i/o timeout
2017-09-18T14:16:01+02:00 INFO Error publishing events (retrying): read tcp [::1]:53202->[::1]:5044: i/o timeout

It jams Logstash so badly that I have to restart the whole server.
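
The i/o timeout usually just means Logstash stopped reading while it was stuck, but while debugging it can help to make Filebeat send smaller batches and wait longer before giving up. A rough filebeat.yml sketch (assuming Filebeat 5.x; bulk_max_size and timeout are standard options of its logstash output, and the values here are only examples):

output.logstash:
  hosts: ["localhost:5044"]
  # Smaller batches narrow down which events were in flight when the error hit.
  bulk_max_size: 128
  # Give Logstash more time before the read times out (seconds, default 30).
  timeout: 60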

Here's the error message from Logstash:

[2017-09-19T07:49:37,099][ERROR][logstash.outputs.elasticsearch] Encountered an unexpected error submitting a bulk request! Will retry. {:error_message=>"incompatible encodings: Windows-1250 and UTF-8", :class=>"Encoding::CompatibilityError", :backtrace=>["e:/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-7.4.0-java/lib/logstash/outputs/elasticsearch/common.rb:149:in `submit'", "org/jruby/RubyArray.java:1613:in `each'", "org/jruby/RubyEnumerable.java:971:in `each_with_index'", "e:/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-7.4.0-java/lib/logstash/outputs/elasticsearch/common.rb:127:in `submit'", "e:/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-7.4.0-java/lib/logstash/outputs/elasticsearch/common.rb:87:in `retrying_submit'", "e:/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-7.4.0-java/lib/logstash/outputs/elasticsearch/common.rb:38:in `multi_receive'", "e:/logstash/logstash-core/lib/logstash/output_delegator_strategies/shared.rb:13:in `multi_receive'", "e:/logstash/logstash-core/lib/logstash/output_delegator.rb:49:in `multi_receive'", "e:/logstash/logstash-core/lib/logstash/pipeline.rb:436:in `output_batch'", "org/jruby/RubyHash.java:1342:in `each'", "e:/logstash/logstash-core/lib/logstash/pipeline.rb:435:in `output_batch'", "e:/logstash/logstash-core/lib/logstash/pipeline.rb:381:in `worker_loop'", "e:/logstash/logstash-core/lib/logstash/pipeline.rb:342:in `start_workers'"]}

I'm currently running with pipeline.workers: 1, pipeline.batch.size: 1 and log.level: trace, and I still can't pinpoint the offending line. The file is generated on an OpenVMS system and pushed to Windows over FTP/SFTP. If I open it and do a "save as", it works.
Other files are working fine, and they are generated the same way.
I want to pinpoint the row so we can investigate further on VMS.
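
For reference, the corresponding logstash.yml entries (these are standard Logstash 5.x settings; shown here only to document what the debugging run looked like):

pipeline.workers: 1
pipeline.batch.size: 1
log.level: trace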

What version of Logstash are you running?

The latest: 5.6.0.
This was also a problem on 5.5.2; I just didn't notice it earlier.

I found a row that I suspect is the culprit, because it contains an incorrectly written date.
But if that's the case, it shouldn't throw an error like this.
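
A wrongly written date on its own normally wouldn't crash the pipeline: the date filter just tags the event with _dateparsefailure and moves on. One way to surface such rows (a sketch, not your exact config; the file path is only an example) is to divert tagged events to a local file instead of Elasticsearch:

output {
	if "_dateparsefailure" in [tags] {
		# Events whose date fields failed to parse land here for inspection.
		file {
			path => "e:/logstash/debug/bad-dates.log"
		}
	} else {
		elasticsearch {
			hosts => [ "localhost:9200" ]
			index => "abc"
		}
	}
}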

Can you share your logstash config?

What's in the Elasticsearch logs at that time?

Right now, after pinning the Logstash batch size to 1, I've managed to find in the logs that the last processed entry contained:
34853632|note=C3|patronCategory=006|lastVisitDate=6.I.(1.ša |libraryDept=00|fir

After fixing this date issue, my file was parsed completely. But I'm still puzzled by the fact that the "save as" copy of the file went through before.

I'll now load all of my 800,000,000 logs into Elasticsearch again. I hope this is it, because I've spent two weeks searching for and fixing this.

@jogoinar10:

# The # character at the beginning of a line indicates a comment. Use
# comments to describe your configuration.
input {
	beats {
		port => "5044"
	}
}
# The filter part of this file is commented out to indicate that it is
# optional.
filter {
	mutate {
		gsub => ["message", "\|C3\|", "|cir=C3|"]
	}

#	kv {
#		field_split => "|"
#		include_brackets => false
#	}
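	# Split the pipe-delimited message into key=value pairs, set each pair as
	# an event field, and upper-case the acronym field.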
	ruby {
		code => "
			a = event.get('message').split('|').delete_if{|x| !x.match(/=/)}
			a.each {|y| b = y.split('=', 2)
				event.set(b[0].strip, b[1])
			}
			event.set('acronym', event.get('acronym').upcase)"
	}
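	# Replace the space between date and time with ';' so the value matches the
	# "dd.MM.YYYY;HH:mm:ss" pattern below, coerce numeric fields, and tag the country.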
	mutate {
		gsub => ["date", " ", ";"]
		convert => {"type" => "integer"}
		convert => {"rptPackageStatus" => "integer"}
		add_field => {"country" => "si"}
	}
	date {
		locale => "en"
		match => ["date", "dd.MM.YYYY;HH:mm:ss"]
		timezone => "Europe/Ljubljana"
		target => "date"
	}
	date {
		locale => "en"
		match => ["returnDate", "dd.MM.YYYY"]
		timezone => "Europe/Ljubljana"
		target => "returnDate"
	}
	date {
		locale => "en"
		match => ["firstsignUpDate", "dd.MM.YYYY"]
		timezone => "Europe/Ljubljana"
		target => "firstsignUpDate"
	}
	date {
		locale => "en"
		match => ["lastVisitDate", "dd.MM.YYYY"]
		timezone => "Europe/Ljubljana"
		target => "lastVisitDate"
	}
	date {
		locale => "en"
		match => ["loanDate", "dd.MM.YYYY"]
		timezone => "Europe/Ljubljana"
		target => "loanDate"
	}
	date {
		locale => "en"
		match => ["lastProlongDate", "dd.MM.YYYY"]
		timezone => "Europe/Ljubljana"
		target => "lastProlongDate"
	}
	date {
		locale => "en"
		match => ["reservationDate", "dd.MM.YYYY"]
		timezone => "Europe/Ljubljana"
		target => "reservationDate"
	}
}
output {
	elasticsearch {
		hosts => [ "localhost:9200" ]
		index => "abc"
		document_type => "log_abc"
	}
#	stdout { codec => json }
#	stdout { codec => rubydebug }
}
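
In case it helps anyone else: a quick way to replay a single suspect line through the same filters (a sketch; "debug.conf" is just a hypothetical file name, and it assumes the filter {} block above is copied in unchanged):

# debug.conf - paste the filter {} block from the main config between input and output
input {
	stdin { }
}
output {
	stdout { codec => rubydebug }
}

Then pipe the suspect row into bin/logstash -f debug.conf and inspect the parsed event.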

The previous fix solved it. It was a charset problem coming from the VAX system.
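
For anyone who hits the same "incompatible encodings: Windows-1250 and UTF-8" error but cannot fix the source file, one defensive option (a sketch, not the fix used here; it assumes the raw bytes really are Windows-1250) is to transcode the message inside the pipeline before it reaches the elasticsearch output:

filter {
	ruby {
		code => "
			m = event.get('message')
			unless m.nil?
				# Treat the incoming bytes as Windows-1250 and transcode to UTF-8,
				# replacing anything that cannot be converted instead of raising.
				event.set('message', m.encode('UTF-8', 'Windows-1250', :invalid => :replace, :undef => :replace))
			end
		"
	}
}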
