Again this issue: Reached open files limit: 4095, set by the 'max_open_files' option or default, files yet to open: 4298

(samyo) #1

Hello folks,
I am using Elasticsearch and Logstash version 6.3.1 with Java 8 on an Ubuntu virtual machine with 8 GB of RAM and 50 GB of storage.
I am trying to index 8000 text files: Logstash filters them and sends them to Elasticsearch. I did not install either of them as a service; I just run them normally. Could that cause any problems?
Logstash sends only about 4000 of the files, and then it shows the following:

```
Reached open files limit: 4095, set by the 'max_open_files' option or default, files yet to open: 4298
```

After a lot of research, I changed the following:

the heap size of Elasticsearch and Logstash, in the jvm.options file of each
```
sudo sysctl -w vm.max_map_count=262144
```
and I changed this:
```
ulimit -Sn 63536
ulimit -Hn 63536
```

But it still did not work.
Maybe I should increase LS_OPEN_FILES in the Logstash config in /etc/init.d/logstash, but I do not know how.

Why does Logstash keep running into this **reached-open-file-size** problem?
Is it caused by Logstash itself, or by my config file?
My config file contains just an input, a filter with 4 grok patterns that match 4 different things in the text files, and the output. That is all.

Can anyone explain why this keeps happening with Logstash, and how to get rid of it? I have tried everything.

Thanks.
#2

What does your input look like?

(samyo) #3

input
{
    file
    {
        codec => multiline
        {
            pattern => "^\s"
            negate => true
            what => previous
            charset => "UTF-8"
        }
        path => "/home/sam/Desktop/folder2/*.txt"
        start_position => "beginning"
        sincedb_path => "/dev/null"
        exclude => "*.gz"
    }
}
#4

Firstly, your path option suggests you are running on Windows. If that is the case then your sincedb_path should be "NUL", not "/dev/null".

Secondly, you are running in tail mode, and as the documentation states the "plugin always assumes that there will be more content". So it opens the first 4095 files that match the path option, reads them, and then waits for data to be appended to them. If no data is appended to a given file after an hour (the default value of the close_older option) it will close it and open another file that matches the path. So this all appears to be working as expected.

I suggest you read the documentation. You might want to decrease close_older, or increase max_open_files, or even change the mode from tail to read.
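The three options mentioned above could look roughly like this in the file input. This is only a sketch: the numbers are illustrative assumptions rather than recommendations, the multiline codec is left out for brevity, and the `mode` option is only available in versions of the file input plugin that support read mode.

```
input {
    file {
        path => "/home/sam/Desktop/folder2/*.txt"
        sincedb_path => "/dev/null"
        start_position => "beginning"

        # Option 1: close idle files sooner than the default of one hour.
        # The value is in seconds and is an illustrative assumption.
        close_older => 30

        # Option 2: allow more files to be open at once than the default of 4095.
        # The operating system's own open-file limit still applies on top of this.
        max_open_files => 8192

        # Option 3: read each file to the end and then release it,
        # instead of waiting for more content as tail mode does.
        # mode => "read"
    }
}
```

Any one of the three may be enough on its own; they are shown together only to keep the example short.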

(samyo) #5

Sorry, I changed the path; I am still using Ubuntu on a virtual machine.
So you mean that I have to change what => previous to what => next for read mode?

I just want Logstash to read all the existing files in "folder2", and after it has matched them and sent them to Elasticsearch, it should keep listening for any changes in "folder2", and so on.
That is the whole idea.

Logstash should keep listening forever; it should never stop watching folder2. After matching a file, it should send it to Elasticsearch, close the matched file, and continue listening.
But it seems Logstash leaves all matched files open.
How can I make Logstash close those matched files?
Is that possible?
Logstash will skip the files that are not matched by my filter, is that right?
Is the problem really caused only by my input, or does it depend on my system and its max-open-file limit?

#6

Then you should probably change from tail mode to read mode. Make sure you understand the default value of file_completed_action.
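As a rough sketch of what read mode might look like for this input: in read mode the plugin treats a file as finished once it reaches the end, and the documented default for `file_completed_action` is `delete`, which removes the source file after it has been read. The example below uses `log` instead so the files are kept. The log file path is a hypothetical location chosen for illustration, and the multiline codec is omitted for brevity.

```
input {
    file {
        path => "/home/sam/Desktop/folder2/*.txt"
        sincedb_path => "/dev/null"
        mode => "read"

        # The default action is "delete". "log" keeps the source file and
        # appends its path to the log file named below instead.
        file_completed_action => "log"

        # Needed when the action is "log" or "log_and_delete".
        # Hypothetical path; keep it outside the folder being watched.
        file_completed_log_path => "/home/sam/Desktop/completed_files.log"
    }
}
```

The completed-files log is only ever appended to, so it can grow large over time.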

(samyo) #7

I changed to read mode, but Logstash still sends only 4000 files and then shows me
Reached open files limit: 4095, set by the 'max_open_files' option or default, files yet to open: 4298

I no longer know what I am doing wrong.
I even changed the open-file limit on the system.

(samyo) #8

I changed my input to the settings below, and Logstash could send 5100 of the 8000 files instead of 4000. After that, Logstash shows the following:

java.lang.OutOfMemoryError: Java heap space linux

What should I do now? Should I increase the Java heap size, and how do I do that on Linux?
Is something wrong with my input?
Still, with this new input Logstash could send 1100 more files than before. Any suggestions?

My input now looks like this:

input {
    file {
        path => "/home/sam/Desktop/folder2/*.txt"
        sincedb_path => "/dev/null"
        start_position => "beginning"
        stat_interval => 300
        discover_interval => 1
        ignore_older => 864000
        close_older => 20
        max_open_files => 102400
        codec => multiline
        {
            pattern => "^\s"
            negate => true
            what => previous
        }
    }
}
(samyo) #9

I think I know where the problem is. My input does not seem very efficient: it takes up a lot of the Java heap. I think this is because Logstash does not close the unmatched or unreadable files and keeps opening more and more files until the heap is exhausted.

Can anybody please tell me how to control my input so that it does not run out of heap space? I think I am doing something wrong with the "codec" setting, or is there anything else I am doing wrong?

I even tried "read mode" instead of "tail mode", but there was no big difference; at some point it still exhausts the Java heap space.

Does anyone have an idea how to correct my input so that it dynamically reads all the files from "folder2" and keeps listening without running out of memory?
I would be thankful.

#10

You could try modifying jvm.options in /etc/logstash to change the heap size.

Your codec will accumulate lines that start with whitespace and append them to the previous line. That event will be kept in memory until a line that does not start with whitespace is appended to the file and causes the event to be flushed. In other words, the last event for every file is being kept in memory. Take a look at the auto_flush_interval on the codec.
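A minimal sketch of the codec with `auto_flush_interval` set, assuming the same pattern as in the input above. The interval of 2 seconds is an illustrative assumption; the option has no default, and without it a pending event is only flushed when the next non-matching line arrives.

```
input {
    file {
        path => "/home/sam/Desktop/folder2/*.txt"
        sincedb_path => "/dev/null"
        start_position => "beginning"
        codec => multiline {
            pattern => "^\s"
            negate => true
            what => previous
            # Emit the accumulated lines as an event if no new data has been
            # appended for this many seconds (illustrative value).
            auto_flush_interval => 2
        }
    }
}
```

With this set, the last event of each file no longer has to wait in memory for another line that may never come.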

(samyo) #11

So why does Logstash then run out of Java heap space?
I increased everything: the heap size for Elasticsearch and for Logstash in the jvm.options file.
I increased the max-open-file limit on the system, as in the picture above.
I even increased the Java heap size with: export _JAVA_OPTIONS="-Xmx6G"
But Logstash still runs out of heap space or reaches the max open files limit.

This problem is very difficult, because to me it means Logstash is not good enough to handle something like "sending 1000000 files into an index in Elasticsearch" without running out of memory.

What should I do?
Is my input so bad that it always makes Logstash run out of memory?

Can you please tell me how to write or correct my input so that it dynamically reads text files from a folder and keeps listening for any newly added text file?

Thanks.

#12

As I said, take a look at the auto_flush_interval on the codec.

(samyo) #13

Hello Badger, I modified my input and used auto_flush_interval in tail mode. It works fine now and no longer runs out of memory. But after reading all the files without problems, I now get this warning:
Identity Map Codec has reached 80% capacity {:current_size=>16775, :upper_limit=>20000}

What does that mean? Should I ignore this warning?
Is it possible to increase the capacity of the Identity Map Codec, and how?

After sending about 19000 matched files, Logstash crashed with this error:

[2019-05-21T12:58:40,600][ERROR][logstash.codecs.multiline] IdentityMapCodec has reached 100% capacity {:current_size=>20000, :upper_limit=>20000}
[2019-05-21T12:58:41,993][ERROR][org.logstash.Logstash    ] java.lang.IllegalStateException: Logstash stopped processing because of an error: (IdentityMapUpperLimitException) LogStash::Codecs::IdentityMapCodec::IdentityMapUpperLimitException

Thanks.

#14

I suggest you read this and follow some of the links from it. Does setting a lower value for close_older help?

(samyo) #15

OK, I will look at it again. Thanks.