leandrojmp
(Leandro Pereira)
August 27, 2023, 5:37pm
22
There are multiple errors.
First, if you have the compressed files in the same directory, you need to add an exclude option to your input.
For example, add this line inside the file input:
exclude => "*.bz2"
Second, your Elasticsearch disk is full:
:error=>{"type"=>"cluster_block_exception", "reason"=>"index [jira-access-log-2023.08.27] blocked by: [TOO_MANY_REQUESTS/12/disk usage exceeded flood-stage watermark, index has read-only-allow-delete block];"}}
You need to free up some space on your Elasticsearch node; it cannot write anything right now.
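If you want to check where the space went (this is a sketch, assuming you can reach the cluster's REST API), the cat allocation endpoint shows per-node disk usage:
GET _cat/allocation?v
The disk.used, disk.avail and disk.percent columns will show which node crossed the flood-stage watermark.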
Badger
August 27, 2023, 6:10pm
23
Perhaps you can avoid reading the compressed files by using a pattern like /path/to/the/access_log.[-0-9]{10}
danmed
(dan)
August 27, 2023, 6:39pm
24
Thanks leandrojmp. I added the exclude option like you said, but I still see it processing .bz2 files after I restarted Logstash.
input {
  file {
    path => "/path/to/access_log.*"
    start_position => "beginning"
    exclude => "*.bz2"
  }
}
Additionally, I am using the dissect filter below, which seems to work as well.
filter {
  dissect {
    mapping => { "message" => "%{ip} %{id} %{user} [%{[@metadata][timestamp]} %{timezone}] %{message}" }
  }
  date {
    match => [ "[@metadata][timestamp]", "dd-MMM-yyyy:HH:mm:ss" ]
    target => "@timestamp"
  }
}
But even with that, Logstash keeps processing the .bz2 files.
Is there anything else I need to do to keep it from processing compressed files?
Thanks
danmed
(dan)
August 27, 2023, 6:43pm
25
Thanks Badger, I will try it out.
danmed
(dan)
August 27, 2023, 6:53pm
26
Badger, do you mean /path/to/access_log.????-??-??
This way I can get:
ls -l /path/to/access_log.????-??-??
-rw-r----- 1 <user> <group> 182208 Aug 27 20:48 /path/to/access_log.2023-08-27
or would that not work?
Badger
August 27, 2023, 6:59pm
27
That is another pattern that will have mostly the same effect.
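For example, a sketch of a file input using your glob (adapting the config you posted earlier):
input {
  file {
    path => "/path/to/access_log.????-??-??"
    start_position => "beginning"
  }
}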
danmed
(dan)
August 27, 2023, 7:12pm
28
Yes, that is working; the file is being processed with the '?'s.
Now to get the filter right, because I get an error in my date parsing.
See the tags below.
I could try switching back to grok, which I would prefer... but here's the error when I use dissect:
[2023-08-27T20:56:35,321][INFO ][logstash.outputs.opensearch][jira-acc-pipeline][06c5861d0b1e2b5a4752a90d08ee0a042812fa2e1863bae9615aa42c05232fda] Retrying failed action {:status=>429, :action=>["index", {:_id=>nil, :_index=>"access_log-new2-2023.08.27", :routing=>nil}, {"@timestamp"=>2023-08-27T18:56:02.010709461Z, "message"=>"\"GET /rest/api/lates....................................etc............ HTTP/1.0\" 404 54 11 \"-\" \"Atlassian HttpClient 0.23.0 / Atlassian JIRA Rest Java Client-4.0.3-sc (0) / Default\" \"xyz1234\"", "@version"=>"1", "log"=>{"file"=>{"path"=>"/path/to/access_log.2023-08-27"}}, "host"=>{"name"=>"myhost.com"}, "event"=>{"original"=>"nnn.nnn.nnn.nn 123x234x567x123 some.user.name [27/Aug/2023:11:00:21 +0200] \"GET /rest/api/latest...........................etc.................... HTTP/1.0\" 404 54 11 \"-\" \"Atlassian HttpClient 0.23.0 / Atlassian JIRA Rest Java Client-4.0.3-sc (0) / Default\" \"xyz1234\""}, "tags"=>["_dateparsefailure"], "ip"=>"nnn.nnn.nnn.nn", "user"=>"some.user.name", "id"=>"123x234x567x123", "timezone"=>"+0200"}], :error=>{"type"=>"cluster_block_exception", "reason"=>"index [access_log-new2-2023.08.27] blocked by: [TOO_MANY_REQUESTS/12/disk usage exceeded flood-stage watermark, index has read-only-allow-delete block];"}}
Plus, I don't know why I have the disk usage issue; I have enough space. I could try to reset its disk usage params for this particular index, I don't know.
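Actually, comparing the event above with my filter: the log timestamp uses slashes (27/Aug/2023:11:00:21) while my date pattern uses dashes (dd-MMM-yyyy), so a corrected date filter might look like this (a sketch assuming that timestamp format):
date {
  match => [ "[@metadata][timestamp]", "dd/MMM/yyyy:HH:mm:ss" ]
  target => "@timestamp"
}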
danmed
(dan)
August 27, 2023, 7:45pm
29
I think I managed to parse the date/time, because there is no date parsing failure anymore.
But I still get the disk issue.
The disk on which Logstash stores its logs and where it is installed has plenty of space left.
What exactly does it mean then? Is it talking about disk space on the target OpenSearch server's disk?
Here's another log line below. The timestamp seems OK now, but the cluster_block_exception error remains:
[2023-08-27T21:28:54,683][INFO ][logstash.outputs.opensearch][jira-acc-pipeline][b1bc0ac19696818e24519653a5a52c38186c3ca52c116d9b501783e7d6aa28db] Retrying failed action {:status=>429, :action=>["index", {:_id=>nil, :_index=>"jira-access-new3-2023.08.27", :routing=>nil}, {"@timestamp"=>2023-08-27T19:28:50.000Z, "log"=>{"file"=>{"path"=>"/path/to/access_log.2023-08-27"}}, "@version"=>"1", "message"=>"\"GET /rest/api/2/searc..............etc.....Results=2000 HTTP/1.0\" 200 53 17 \"-\" \"Java/1.8.0_45\" \"xyz1234\"", "event"=>{"original"=>"nnn.nnn.nnn.nn 1234x123445x1 some.user.name [27/Aug/2023:21:28:50 +0200] \"GET /rest/api/2/search..............................etc.................Results=2000 HTTP/1.0\" 200 53 17 \"-\" \"Java/1.8.0_45\" \"xyz1234\""}, "ip"=>"nnn.nnn.nnn.nn", "host"=>{"name"=>"myhost.com"}, "id"=>"1234x123445x1", "user"=>"some.user.name"}], :error=>{"type"=>"cluster_block_exception", "reason"=>"index [jira-access-new3-2023.08.27] blocked by: [TOO_MANY_REQUESTS/12/disk usage exceeded flood-stage watermark, index has read-only-allow-delete block];"}}
Thanks.
Badger
August 27, 2023, 7:57pm
30
Yes. I don't know OpenSearch, but I believe from when I was using Elasticsearch that once ES had stopped ingesting data due to a disk space issue, it was not enough to free up disk space; you needed to tell ES that you had done so.
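If I remember correctly, once you free up space you can clear the block by resetting the index setting, something like this (assuming OpenSearch exposes the same settings API as Elasticsearch):
PUT _all/_settings
{
  "index.blocks.read_only_allow_delete": null
}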
This thread has got you to the point where Logstash is reading the right logs from a file input. You should probably start a new thread if you have filter issues.
danmed
(dan)
August 27, 2023, 8:04pm
31
Thanks Badger.
I think the file input issue has been resolved.
The filter issue seems resolved as well, because the only error I see is the disk space one, and if that is on the target server then I can close this thread here.
I fully appreciate all the help from everyone.
Thanks again.
system
(system)
Closed
September 24, 2023, 8:05pm
32
This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.