Elastic Stack - get the last message that fails in Logstash

I'm indexing custom logs into Logstash.
With one file I'm getting timeouts, and I can't determine which line is causing them.
As far as I can see, Filebeat is shipping them in batches.

Is there any option to get this single failing message out of Logstash?
I didn't get any further with the stdout output and the json/rubydebug codecs applied.
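
One option that might help narrow it down (a sketch only, assuming the stock file output plugin and json_lines codec that ship with Logstash; the path is just an example): mirror every event to a local file alongside the elasticsearch output, so the last line written before the pipeline stalls is a good candidate for the event that breaks the batch.

output {
	elasticsearch {
		hosts => [ "localhost:9200" ]
	}
	# Debug mirror: one JSON line per event; the last line written before
	# the stall should point at the problematic message.
	file {
		path => "e:/logstash/debug/last-events.log"
		codec => json_lines
	}
}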

My error, if it's any additional help:

2017-09-18T14:16:01+02:00 ERR Failed to publish events caused by: read tcp [::1]:53202->[::1]:5044: i/o timeout
2017-09-18T14:16:01+02:00 INFO Error publishing events (retrying): read tcp [::1]:53202->[::1]:5044: i/o timeout

It jams Logstash so badly that I have to restart the whole server.
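
The i/o timeout usually just means Logstash stopped reading while it was stuck, but while debugging it can help to make Filebeat send smaller batches and wait longer before giving up. A rough filebeat.yml sketch (assuming Filebeat 5.x; bulk_max_size and timeout are standard options of its logstash output, and the values here are only examples):

output.logstash:
  hosts: ["localhost:5044"]
  # Smaller batches narrow down which events were in flight when the error hit.
  bulk_max_size: 128
  # Give Logstash more time before the read times out (seconds, default 30).
  timeout: 60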

Here's the error message from Logstash:

[2017-09-19T07:49:37,099][ERROR][logstash.outputs.elasticsearch] Encountered an unexpected error submitting a bulk request! Will retry. {:error_message=>"incompatible encodings: Windows-1250 and UTF-8", :class=>"Encoding::CompatibilityError", :backtrace=>["e:/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-7.4.0-java/lib/logstash/outputs/elasticsearch/common.rb:149:in `submit'", "org/jruby/RubyArray.java:1613:in `each'", "org/jruby/RubyEnumerable.java:971:in `each_with_index'", "e:/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-7.4.0-java/lib/logstash/outputs/elasticsearch/common.rb:127:in `submit'", "e:/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-7.4.0-java/lib/logstash/outputs/elasticsearch/common.rb:87:in `retrying_submit'", "e:/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-7.4.0-java/lib/logstash/outputs/elasticsearch/common.rb:38:in `multi_receive'", "e:/logstash/logstash-core/lib/logstash/output_delegator_strategies/shared.rb:13:in `multi_receive'", "e:/logstash/logstash-core/lib/logstash/output_delegator.rb:49:in `multi_receive'", "e:/logstash/logstash-core/lib/logstash/pipeline.rb:436:in `output_batch'", "org/jruby/RubyHash.java:1342:in `each'", "e:/logstash/logstash-core/lib/logstash/pipeline.rb:435:in `output_batch'", "e:/logstash/logstash-core/lib/logstash/pipeline.rb:381:in `worker_loop'", "e:/logstash/logstash-core/lib/logstash/pipeline.rb:342:in `start_workers'"]}

I'm currently running with pipeline.workers: 1, pipeline.batch.size: 1 and log.level: trace, and I still can't pinpoint the offending line. The file is generated on an OpenVMS system and pushed to Windows over FTP/SFTP. If I open it and do a "save as", it works.
Other files are working fine, and they are generated the same way.
I want to pinpoint the row so we can investigate further on VMS.
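
For reference, the corresponding logstash.yml entries (these are standard Logstash 5.x settings; shown here only to document what the debugging run looked like):

pipeline.workers: 1
pipeline.batch.size: 1
log.level: trace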

What version of Logstash are you running?

The latest: 5.6.0.
This was also a problem on 5.5.2; I just didn't notice it earlier.

I found a row that I suspect is the culprit, because it contains an incorrectly written date.
But if that's the case, it shouldn't throw an error like this.
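
A wrongly written date on its own normally wouldn't crash the pipeline: the date filter just tags the event with _dateparsefailure and moves on. One way to surface such rows (a sketch, not your exact config; the file path is only an example) is to divert tagged events to a local file instead of Elasticsearch:

output {
	if "_dateparsefailure" in [tags] {
		# Events whose date fields failed to parse land here for inspection.
		file {
			path => "e:/logstash/debug/bad-dates.log"
		}
	} else {
		elasticsearch {
			hosts => [ "localhost:9200" ]
			index => "abc"
		}
	}
}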

Can you share your logstash config?

What's in the Elasticsearch logs at that time?

Right now, after pinning the Logstash batch size to 1, I've managed to find in the logs that the last processed entry contained:
34853632|note=C3|patronCategory=006|lastVisitDate=6.I.(1.ša |libraryDept=00|fir

After fixing this date issue, my file was parsed completely. But I'm still puzzled by the fact that the "save as" copy of the file went through before.

I'll now load all of my 800,000,000 logs into Elasticsearch again. I hope this is it, because I've spent two weeks searching for and fixing this.

@jogoinar10:

# The # character at the beginning of a line indicates a comment. Use
# comments to describe your configuration.
input {
	beats {
		port => "5044"
	}
}
# The filter part of this file is commented out to indicate that it is
# optional.
filter {
	mutate {
		gsub => ["message", "\|C3\|", "|cir=C3|"]
	}

#	kv {
#		field_split => "|"
#		include_brackets => false
#	}
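	# Split the pipe-delimited message into key=value pairs, set each pair as
	# an event field, and upper-case the acronym field.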
	ruby {
		code => "
			a = event.get('message').split('|').delete_if{|x| !x.match(/=/)}
			a.each {|y| b = y.split('=', 2)
				event.set(b[0].strip, b[1])
			}
			event.set('acronym', event.get('acronym').upcase)"
	}
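	# Replace the space between date and time with ';' so the value matches the
	# "dd.MM.YYYY;HH:mm:ss" pattern below, coerce numeric fields, and tag the country.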
	mutate {
		gsub => ["date", " ", ";"]
		convert => {"type" => "integer"}
		convert => {"rptPackageStatus" => "integer"}
		add_field => {"country" => "si"}
	}
	date {
		locale => "en"
		match => ["date", "dd.MM.YYYY;HH:mm:ss"]
		timezone => "Europe/Ljubljana"
		target => "date"
	}
	date {
		locale => "en"
		match => ["returnDate", "dd.MM.YYYY"]
		timezone => "Europe/Ljubljana"
		target => "returnDate"
	}
	date {
		locale => "en"
		match => ["firstsignUpDate", "dd.MM.YYYY"]
		timezone => "Europe/Ljubljana"
		target => "firstsignUpDate"
	}
	date {
		locale => "en"
		match => ["lastVisitDate", "dd.MM.YYYY"]
		timezone => "Europe/Ljubljana"
		target => "lastVisitDate"
	}
	date {
		locale => "en"
		match => ["loanDate", "dd.MM.YYYY"]
		timezone => "Europe/Ljubljana"
		target => "loanDate"
	}
	date {
		locale => "en"
		match => ["lastProlongDate", "dd.MM.YYYY"]
		timezone => "Europe/Ljubljana"
		target => "lastProlongDate"
	}
	date {
		locale => "en"
		match => ["reservationDate", "dd.MM.YYYY"]
		timezone => "Europe/Ljubljana"
		target => "reservationDate"
	}
}
output {
	elasticsearch {
		hosts => [ "localhost:9200" ]
		index => "abc"
		document_type => "log_abc"
	}
#	stdout { codec => json }
#	stdout { codec => rubydebug }
}
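
In case it helps anyone else: a quick way to replay a single suspect line through the same filters (a sketch; "debug.conf" is just a hypothetical file name, and it assumes the filter {} block above is copied in unchanged):

# debug.conf - paste the filter {} block from the main config between input and output
input {
	stdin { }
}
output {
	stdout { codec => rubydebug }
}

Then pipe the suspect row into bin/logstash -f debug.conf and inspect the parsed event.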

The previous fix solved it. It was a charset problem coming from the VAX system.
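
For anyone who hits the same "incompatible encodings: Windows-1250 and UTF-8" error but cannot fix the source file, one defensive option (a sketch, not the fix used here; it assumes the raw bytes really are Windows-1250) is to transcode the message inside the pipeline before it reaches the elasticsearch output:

filter {
	ruby {
		code => "
			m = event.get('message')
			unless m.nil?
				# Treat the incoming bytes as Windows-1250 and transcode to UTF-8,
				# replacing anything that cannot be converted instead of raising.
				event.set('message', m.encode('UTF-8', 'Windows-1250', :invalid => :replace, :undef => :replace))
			end
		"
	}
}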
