The Logstash instance we're running operates 24 hours a day and processes millions of logs every day. It uses a file as a mapping dictionary through a Translate filter with the default refresh interval; the docs mention a default rate for refreshing the file, and that's what we have in the configuration file.
My question is about the architecture of the refresh process: if logs are continuously being processed, is the mapping file held under a lock and read continuously, or is it read into memory at each refresh interval, with that in-memory copy used for the mapping?
My use case is that the file will be updated on a weekly basis, but the updating process can only write to the file if it's not being read by another process (i.e. Logstash in this case). Any help is appreciated!
Thanks @Badger, you mean the file contents themselves? So if the file is being updated it doesn't matter, because the previous contents are already held in an in-memory hash, right?
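Just to check my mental model, is it roughly like this hypothetical sketch (made-up names, not the actual plugin code)? Lookups only ever touch the in-memory hash, and the file is read briefly once per refresh interval:

```ruby
require 'csv'

# Hypothetical sketch of the reload model, not the translate filter's
# real implementation: the dictionary is parsed into a Hash, and a
# background refresher swaps in a freshly parsed Hash every
# refresh_interval seconds. Lookups never read the file directly.
class DictionarySketch
  def initialize(path, refresh_interval)
    @path = path
    @refresh_interval = refresh_interval
    @mapping = load_file
    start_refresher
  end

  # Lookups hit the current in-memory Hash, never the file.
  def lookup(key)
    @mapping[key]
  end

  private

  def load_file
    # Parse "key,value" rows into a Hash.
    CSV.read(@path).to_h
  end

  def start_refresher
    Thread.new do
      loop do
        sleep @refresh_interval
        @mapping = load_file # swap in the freshly parsed contents
      end
    end
  end
end
```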
That makes sense, thanks! Would you know why I could be running into the following error:
[main] Pipeline aborted due to error {:pipeline_id=>"main", :exception=>#<LogStash::Filters::Dictionary::DictionaryFileError: Translate: Unquoted fields do not allow \r or \n (line 1). when loading dictionary file at ...
What do you mean? Isn't that the normal way a CSV is defined? We tried it without the double quotes and got the exact same error, which is why we tried it with the quotes.
I mean a file that uses \r\n as a line ending on a machine that uses \n as a line ending. That would result in the Ruby CSV parser seeing a trailing \r on a field.
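You can reproduce it with Ruby's CSV parser directly. A minimal sketch (forcing row_sep to "\n" simulates a reader that expects Unix line endings; the exact error wording varies across csv gem versions):

```ruby
require 'csv'

# A dictionary line saved with a Windows-style "\r\n" ending, parsed with
# "\n" as the row separator: the parser treats the trailing "\r" as part
# of the unquoted field "bar" and rejects it.
begin
  CSV.parse("foo,bar\r\n", row_sep: "\n")
rescue CSV::MalformedCSVError => e
  puts e.message
  # e.g. "Unquoted fields do not allow \r or \n (line 1.)"
end
```

So converting the dictionary file to plain \n line endings should make the parser happy.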