IdentityMapCodec has reached 100% capacity {:current_size=>20000, :upper_limit=>20000} warning: thread "[main]<file" terminated with exception (report_on_exception is true):

Hi Team, Community,
A. Error
I need your assistance with this error I'm getting: IdentityMapCodec has reached 100% capacity {:current_size=>20000, :upper_limit=>20000} warning: thread "[main]<file" terminated with exception (report_on_exception is true):

I searched for this error in the forum but could not find a fix. One thing I tried was to increase max_open_files, but that did not resolve the issue.

As you can see below, I have 2 paths, because the CSV files are pulled from different drives on my local system (Windows).

B. Troubleshooting/Investigation

  1. If I run it with type = ABC only, it works fine. But this is because that directory only has a few files in it.

  2. Now if I add type = XYZ, which has several files, this is where I face the issue.

It seems the issue is only related to the volume of files in the directory, as folder XYZ has several CSV files in it compared to ABC. Otherwise, the conf works fine.
How can I resolve this, please? Am I missing something in my config to handle directories with several files?

  1. Code is as per below:
    input {
      file {
        type => "ABC"
        path => "D:/PATH1/ABC/*/csv"
        start_position => "beginning"
        sincedb_path => "NUL"
        max_open_files => 102400
      }
      file {
        type => "XYZ"
        path => "C:/PATH2/XYZ//*csv"
        start_position => "beginning"
        sincedb_path => "NUL"
        max_open_files => 102400
      }
    }

    filter {
      if [type] == "ABC" or [type] == "XYZ" {
        csv {
          separator => ";"
          columns => ["Name", "IdNo", "Email"]
        }
      }
    }

    output {
      elasticsearch {
        hosts => ["https://someURLhereXYZ:9243"]
        index => "index0001"
        user => "elastic"
        password => "xxxxxxxxx"
      }
    }

br, borgee

Wow, that's big. You are asking the file input to read 102400 files in parallel. That is not a reasonable request.

The multiline codec has a class that does mapping, which has a limit of 20,000 entries in the map. That class is re-used by other inputs and codecs, including the file input (and should be refactored).

The default limit on open files is 4095, and that's really big. The file input does not stop tailing a file if it has to close a file handle; it just remembers where it left off and comes back to it when it has a chance.

Is there really a reason to set max_open_files so large?

Hi Badger, thanks for the reply.

The reason I increased it is that I got this warning:
[2021-05-16T04:50:21,556][WARN ][filewatch.tailmode.processor][main][780d740ea5698b8844a3826eda01d834d9575cb13b03f16edb21493dad97c6a3] Reached open files limit: 4095, set by the 'max_open_files' option or default, files yet to open: 17545
[2021-05-16T04:50:24,217][WARN ][logstash.filter

What value should I set here?

Also, how do I fix the [IdentityMapCodec has reached 100% capacity] error?
Will setting close_older help?

OK, so that seems like a logical response to that error message, but I think you overdid it. Instead of increasing max_open_files I would suggest you ignore the 'Reached open files limit' warning. If you do not want to ignore it, then increase max_open_files but keep it below 19900, so that it stays under the 20,000-entry limit of the map.

Logstash cannot possibly tail 10,000 files at the same time. Opening and closing them as it has a chance is likely more efficient than trying to do so.

Reducing max_open_files should eliminate the error. Reducing close_older will help it, but understand that closing and re-opening files when they change has a cost that logstash pays.
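As a rough sketch of that advice, the XYZ input could look something like the following. The path is a placeholder rather than the glob from the original post, and the values 8000 and 60 are illustrative assumptions, not tested recommendations. The only constraints taken from this thread are that max_open_files stays below 19900 and that a lower close_older trades file handles for re-open cost.

```
input {
  file {
    type => "XYZ"
    # Placeholder path (assumption); substitute the real glob.
    path => "C:/PATH2/XYZ/*.csv"
    start_position => "beginning"
    sincedb_path => "NUL"
    # Illustrative value: above the 4095 default, well below 19900.
    max_open_files => 8000
    # Illustrative value, in seconds: close a file that has not
    # changed for a minute so its handle can be given to another file.
    close_older => 60
  }
}
```

Leaving max_open_files out entirely falls back to the default and simply produces the 'Reached open files limit' warning while the input works through the backlog.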

If you really need to read tens of thousands of files in a short time then that is a really tough use-case for any system.

Logstash is a great tool, but it is not magic.

Thank you Badger. Yes, that seems to do the job. For now let me try it and observe how it goes. I can see that the latest file is visible in Discover.

Strangely though, I'm seeing this - :exception=>#<CSV::MalformedCSVError: Illegal quoting in line 1.>}

I don't know if this is related to the fact that I set close_older.

Any idea? If it is not related to the issue I just reported, I'll skim through the threads or open a new topic.

This is not happening on my local machine, although locally I only have very few files.
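One way to narrow this down, sketched under assumptions: the csv filter tags events it cannot parse with _csvparsefailure by default, so those events can be printed to the console to see which lines contain the offending quote. The quote_char line is commented out because it is only a guess at a workaround; it assumes fields are never wrapped in double quotes and never contain a single quote, which has not been confirmed for these files.

```
filter {
  if [type] == "XYZ" {
    csv {
      separator => ";"
      columns => ["Name", "IdNo", "Email"]
      # Possible workaround (assumption, see above):
      # quote_char => "'"
    }
  }
}

output {
  # Print only the events the csv filter could not parse.
  if "_csvparsefailure" in [tags] {
    stdout { codec => rubydebug }
  }
}
```

The message field of the printed events holds the raw line, which should show whether a stray double quote in the data is what triggers the error.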

As for whether the files will change, I don't think that will be a problem, because the files I'm ingesting are already archived, so no user will need to open or change anything in them.
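Since the files are archived and never change, the file input's read mode may be a better fit than tailing. This was not suggested in the thread, so treat it as an assumption to verify: in read mode each file is read to the end and then closed, rather than being kept open and watched for new lines. The path and the completed-log location below are placeholders. Note that in read mode the default file_completed_action is to delete the file, so it is set to "log" here to keep the archive intact.

```
input {
  file {
    type => "XYZ"
    # Placeholder path (assumption); substitute the real glob.
    path => "C:/PATH2/XYZ/*.csv"
    mode => "read"
    sincedb_path => "NUL"
    # Do not delete files after reading; record them in a log instead.
    file_completed_action => "log"
    # Hypothetical location for the list of completed files.
    file_completed_log_path => "C:/PATH2/logstash_completed.log"
  }
}
```

With sincedb_path set to "NUL", every restart reads all files again, so a real sincedb file would be needed to avoid duplicate documents across restarts.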