Infinite loop in ES, possible bug, please comment

So I'm using Logstash to load a CSV file with 4,998,079 records in it. As it loads, I track the count in Elasticsearch, and I notice that the count goes well beyond 5M and keeps climbing.

Then I notice that the console shows the following message:

A plugin had an unrecoverable error. Will restart this plugin.

  Plugin: <LogStash::Inputs::File path=>["/home/zed/logstash-2.1.0/source_data/nga/AIS_Ships_PD-3874.csv"], sincedb_path=>"/home/zed/logstash-2.1.0/sincedb_path/AIS_Ships_PD-3874.since", type=>"core2", start_position=>"beginning", codec=><LogStash::Codecs::Plain charset=>"UTF-8">, stat_interval=>1, discover_interval=>15, sincedb_write_interval=>15, delimiter=>"\n">
  Error: No such file or directory - /home/zed/logstash-2.1.0/sincedb_path/AIS_Ships_PD-3874.since.14300.17459.522421 {:level=>:error}

Which explains why the load process restarted from the beginning of the file, and why the record count continued to climb.

So out of curiosity, I waited to see if it would halt at the end of the second load. But ES issued the same fatal error message and then restarted the data load for a third time. So I killed the Logstash process.

The problem, which is clear from the error message, is that the sincedb_path value specified a path that did not exist.

The consequence is that this sets up an infinite loop. Not good.

My thought is that ES should check the path and permissions to verify that the sincedb file can be created BEFORE starting the load.

Is there any reason why the file plugin in ES (or Logstash) does not do this?

-- Chris curzon

That's an LS error, not an ES one though?

Yes. I would agree with you. It's a logstash error.

So, do you think this is an error in the file plugin? Or is it a feature based on some kind of optimization trade-off?

An honest question. My experience suggests that when one starts working with a system as complex and configurable as LS or ES, one has to decide how much trust to place in the system defaults. What that usually means is that system defaults are not necessarily bugs, but instead are optimizations for a different use case. (So is there a use case for LS not to check the validity of the sincedb_path value?)

It's sort of like the decision in a relational database of whether or not to use an index. If you want all or most of the records in a table, you don't want the index, but if you're fetching just one or a few records, you do. And in Oracle, or any other relational database, the optimizer usually makes a decent decision about whether to use it or not.

But in LS or ES, there is no optimizer. (Is there?) So in the back of my mind is the possibility that I will--fairly easily--find some dark corner where something untoward happens.

Your thoughts?

Why does this happen? Does the file get created at all? Does the directory exist? Does the LS user have the permissions to write to this directory?

The failure was because the directory didn't exist.

The file system had "sample_data" in the path.

But the config file had "sample_date" specified as the path to use.

So the directory required by the config file did not exist.

I'm just wondering why LS did not check this fact before starting the processing.

I just did a test. In the config file I specified:

input {
  file {
    path => "/home/zed/logstash-2.1.0/source_data/gdelt/20151124.export.CSV"
    sincedb_path => "/homex/zed/logstash-2.1.0/load_conf/gdelt.since"
    type => "gdelt"
    start_position => "beginning"
  }
}
Notice I gave /homex/zed... as the sincedb_path value. This path doesn't exist.

But configtest says this configuration is OK:

$ bin/logstash -f load_conf/gdelt01.conf --configtest

Configuration OK

So I have to be very careful in this regard.
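Until the plugin validates this itself, a pre-flight check in whatever wrapper script launches Logstash can catch this kind of typo before any loading starts. A minimal sketch (the function name and messages are mine, not part of Logstash; the path is the mistyped one from my test config):

```shell
# Pre-flight sketch: verify that the directory portion of sincedb_path
# exists and is writable before launching Logstash.
check_sincedb_dir() {
  dir=$(dirname "$1")            # directory that must hold the sincedb file
  if [ -d "$dir" ] && [ -w "$dir" ]; then
    echo "OK: $dir"
  else
    echo "BAD: $dir"
    return 1
  fi
}

# The mistyped path from my config fails the check:
check_sincedb_dir "/homex/zed/logstash-2.1.0/load_conf/gdelt.since" \
  || echo "fix sincedb_path before starting the load"
```

This is exactly the kind of check I'd have expected --configtest to perform.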

Feel free to raise an issue on the file input repo about this; it does seem like something that should be checked.

I will do as you suggest.

As a newbie, can you tell me how? Or are there instructions for me to follow?


Sure, just create a new issue here -