Sorry, what do you mean by 'second file of the configuration'?
And what is your suggested path forward? If we want to run 4 parallel pipelines (3 with TCP input and 1 with file input), can't we do that?
Appreciate your insights.
I mean that if you have
"pipeline.sources"=>[
"/opt/elastic-release/logstash/pipeline/security1100/100-ingestion.conf",
"/opt/elastic-release/logstash/pipeline/security1100/102-date.conf",
"/opt/elastic-release/logstash/pipeline/security1100/200-tcapLookups.conf",
"/opt/elastic-release/logstash/pipeline/security1100/201-sccpLookups.conf",
"/opt/elastic-release/logstash/pipeline/security1100/202-gtaLookups.conf",
"/opt/elastic-release/logstash/pipeline/security1100/300-secMessagesLookup.conf",
"/opt/elastic-release/logstash/pipeline/security1100/301-secMessages.conf",
"/opt/elastic-release/logstash/pipeline/security1100/302-JanKOrdering.yml",
"/opt/elastic-release/logstash/pipeline/security1100/400-fieldRemoval.conf",
"/opt/elastic-release/logstash/pipeline/security1100/900-output.conf"],
then the error will occur for security1100/102-date.conf
OK. So I understand that this known issue is preventing me from running 4 parallel pipelines (3 with TCP input and 1 with file input)? Are you suggesting that the two are linked?
No, I do not think they are linked. You said you only get the "Could not determine ID for filter/date" error when the Java execution engine is disabled, but the pipeline with the file input stops even when it is enabled.
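For what it is worth, running several pipelines in parallel is supported in itself; they are declared in pipelines.yml. A minimal sketch (the pipeline ids and paths here are hypothetical; adjust them to your layout):

- pipeline.id: tcp-1
  path.config: "/opt/elastic-release/logstash/pipeline/tcp1/*.conf"
- pipeline.id: tcp-2
  path.config: "/opt/elastic-release/logstash/pipeline/tcp2/*.conf"
- pipeline.id: tcp-3
  path.config: "/opt/elastic-release/logstash/pipeline/tcp3/*.conf"
- pipeline.id: file-ingest
  path.config: "/opt/elastic-release/logstash/pipeline/security1100/*.conf"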
Do you have any insights on the pipeline issue I am encountering?
No, sorry.
Hi @hsalim we could try to reproduce it, but some information is needed to work on this.
Are you able to provide a minimal pipeline config that reproduces it? Perhaps a pipeline with a minimal input, filter, and output that manifests the problem, as sketched below.
Which version of Logstash are you using?
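Something along these lines would be enough; this is only a sketch (the glob path is a placeholder, and stdout is used just to keep the repro minimal):

input {
  file {
    path => "/home/ftpuser/tdr/input/diameter/SYSTEM/local/*.gz"
    mode => "read"
  }
}
output {
  stdout { codec => rubydebug }
}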
Hi @Andrea_Selva, the issue we are having is that files waiting to be ingested are not being marked as a "new discovery"; see the trace below. Any ideas?
[2021-05-20T18:20:02,794][TRACE][filewatch.discoverer ] discover_files {"count"=>2254}
[2021-05-20T18:20:02,794][TRACE][filewatch.discoverer ] discover_files handling: {"new discovery"=>false, "watched_file details"=>"<FileWatch::WatchedFile: @filename='202010260420-332388-subset.txt.gz', @state='ignored', @recent_states='[:watched, :watched]', @bytes_read='354', @bytes_unread='0', current_size='351', last_stat_size='351', file_open?='false', @initial=false, @sincedb_key='67116587 0 2058'>"}
So it has read 354 bytes from a 351-byte file. That suggests inode reuse: the sincedb is keyed by inode, so a new file that reuses the inode of a deleted one inherits the old read offset and looks as if it has already been read.
@Badger I am not using sincedb (config is below), so I am not sure how inode reuse would impact me. And if for some weird reason it does, what is the solution? By the way, this only happens on a 3-node cluster, not on a single node.
input {
  file {
    path => "/home/ftpuser/tdr/input/diameter/SYSTEM/local/*.gz"
    max_open_files => 8000
    mode => "read"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
The sincedb_path option determines whether the in-memory sincedb is persisted to disk across restarts. The in-memory db is always used.
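For example, to persist the position database across restarts you would point sincedb_path at a writable file instead of /dev/null; a sketch (the path here is just an example):

input {
  file {
    path => "/home/ftpuser/tdr/input/diameter/SYSTEM/local/*.gz"
    mode => "read"
    sincedb_path => "/var/lib/logstash/sincedb-diameter"
  }
}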
Reliably determining whether a file has been read really requires maintaining a hash of the part of the file that has been read. That means re-hashing the file every time additional data is read, which would be extremely expensive. The file input uses a much cheaper technique which sometimes gets things wrong.
When in read mode and using file_completed_action => "delete", the problem could be solved by deleting the sincedb entry as part of the file deletion process. If someone contributed a PR that implemented that, I expect Elastic would be happy to take a look at it.
Yes, I read up and now better understand the role of sincedb.
My files do not change once written to the directory for ingestion. file_completed_action => "delete" is active (as the default in read mode). I am now testing with sincedb_clean_after set to 45 seconds (sketch below); let's see...
Any other ideas and suggestions are welcome.
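For reference, a sketch of the settings under test (file_completed_action => "delete" is the default in read mode and is shown here only for clarity; the sincedb_clean_after value is the one mentioned above):

input {
  file {
    path => "/home/ftpuser/tdr/input/diameter/SYSTEM/local/*.gz"
    max_open_files => 8000
    mode => "read"
    file_completed_action => "delete"
    sincedb_path => "/dev/null"
    sincedb_clean_after => "45 seconds"
  }
}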