Hi Team, Community.
A. Error
I need your assistance with this error I'm getting:
IdentityMapCodec has reached 100% capacity {:current_size=>20000, :upper_limit=>20000} warning: thread "[main]<file" terminated with exception (report_on_exception is true):
I searched for this error in the forum but couldn't find a fix. One thing I tried was increasing max_open_files, but that didn't resolve the issue.
As you can see below, I have two paths, as the CSV files are pulled from different drives on my local system (Windows).
B. Troubleshooting/Investigation
If I run it with type => "ABC" only, it works fine, but that is because this directory only has a few files in it.
Now if I add type => "XYZ", which has many files, this is where I face the issue.
The issue seems to be related only to the volume of files in the directory, as folder XYZ has many more CSVs in it than ABC. Otherwise, the conf works fine.
How do I resolve this, please? Am I missing something in my config to handle directories with many files?
My config is below:
input {
  file {
    type => "ABC"
    path => "D:/PATH1/ABC/**/*.csv"
    start_position => "beginning"
    sincedb_path => "NUL"
    max_open_files => 102400
  }
  file {
    type => "XYZ"
    path => "C:/PATH2/XYZ/**/*.csv"
    start_position => "beginning"
    sincedb_path => "NUL"
    max_open_files => 102400
  }
}
Wow, that's big. You are asking the file input to read 102400 files in parallel. That is not a reasonable request.
The multiline codec has a class that maintains an identity map, which is limited to 20,000 entries. That class is re-used by other inputs and codecs, including the file input (and should be refactored).
The default limit on open files is 4096, and that is already really big. The file input does not stop tailing a file if it has to close the file handle; it just remembers where it left off and comes back to it when it has a chance.
Is there really a reason to set max_open_files so large?
The reason I increased it is that I got this error:
[2021-05-16T04:50:21,556][WARN ][filewatch.tailmode.processor][main][780d740ea5698b8844a3826eda01d834d9575cb13b03f16edb21493dad97c6a3] Reached open files limit: 4095, set by the 'max_open_files' option or default, files yet to open: 17545
[2021-05-16T04:50:24,217][WARN ][logstash.filter
What value should I set here?
Also, how do I fix the [IdentityMapCodec has reached 100% capacity] error?
Will setting close_older help?
OK, that seems like a logical response to that error message, but I think you overdid it. Instead of increasing max_open_files I would suggest you ignore the 'Reached open files limit' message. If you do not want to ignore it, then increase max_open_files, but keep it below 19900.
Logstash cannot possibly tail 10,000 files at the same time. Opening and closing them as it has a chance is likely more efficient than trying to do so.
Reducing max_open_files should eliminate the error. Reducing close_older will also help, but understand that closing and re-opening files when they change has a cost that logstash pays.
If you really need to read tens of thousands of files in a short time then that is a really tough use-case for any system.
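To make that concrete, here is a rough sketch of what the XYZ input could look like with those two knobs adjusted (the 18000 and "15 minutes" values are only illustrative starting points, not tuned recommendations):

input {
  file {
    type => "XYZ"
    path => "C:/PATH2/XYZ/**/*.csv"
    start_position => "beginning"
    sincedb_path => "NUL"
    # stay well below the 20,000-entry limit of the codec's identity map
    max_open_files => 18000
    # release handles on idle files sooner than the default of "1 hour"
    close_older => "15 minutes"
  }
}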
Thank you Badger. Yeah, that seems to do the trick; for now let me try it and observe how it goes. I can see that the latest file is visible in Discover.
Strangely, though, I'm seeing this: :exception=>#<CSV::MalformedCSVError: Illegal quoting in line 1.>}
I don't know if this is related to the fact that I set close_older.
Any idea? If it's not related to the issue I reported just now, I'll just skim through the threads or open a new topic.
This is something that's not happening on my local machine, although locally I only have very few files.
As for whether the files will change, I'm not sure that will be a problem, because the files I'm ingesting are already archived, so no user will need to open or change anything in those files.
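On the MalformedCSVError: if that is coming from a csv filter in the pipeline (one is not shown in the config above), rows it cannot parse are tagged _csvparsefailure, so one way to investigate is to route those rows somewhere you can inspect them. A minimal sketch, assuming a csv filter is in use; the column names and the failure-log path are placeholders:

filter {
  csv {
    separator => ","
    # columns => ["colA", "colB"]   # placeholder names; replace with the real header
  }
}
output {
  if "_csvparsefailure" in [tags] {
    # write unparseable rows to a side file for inspection
    file { path => "D:/PATH1/csv_failures.log" }
  }
}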