Logstash hangs before indexing the last log event

Hi,

My setup:
Elasticsearch 8.1.0, Kibana 8.1.0, Logstash 8.1.0

My logstash config:

input{
	file{
		path => "C:/Users/avisriva/Documents/Logs/TDW_9 Logs/logs_14082020.txt"
		codec => multiline{
			pattern => "Log Entry: "
			negate => true
			what => previous
		}
		start_position => "beginning"
	}
}
filter{
	grok{
		patterns_dir => ["./patterns"]
		match => ["message", "Log Entry: \r\nCurrent Date Time: %{CUST_TIMESTAMP:timestamp}%{INT:millisecond}\r\n\r\nLogMessage: %{DATA:msg}\r\n"]
	}
	date{
		match => ["timestamp", "dd MM yyyy HH:mm:ss.SSS"]
	}
	mutate{
		copy => {
			"timestamp" => "dateandtime"
		}
	}
	mutate{
		split => {
			"dateandtime" => " "
		}
		add_field => {
			"[@metadata][index_name]" => "tdw_error_logs-%{[dateandtime][2]}.%{[dateandtime][1]}.%{[dateandtime][0]}"
		}
	}
}
output{
	elasticsearch{
		hosts => "https://localhost:9200"
		user => "elastic"
		password => "d2L=pxAjFUH*SdL=Qy9b"
		ssl_certificate_verification => false
		index => "%{[@metadata][index_name]}"
	}
	stdout{
		codec => rubydebug
	}
}

I have a custom log like below:


Log Entry: 
Current Date Time: 14 08 2020 01:35:55.602

LogMessage: Effort File Upload Starts
-------------------------------

Log Entry: 
Current Date Time: 14 08 2020 01:35:56.160

LogMessage: Effort File Upload Ends
-------------------------------

Log Entry: 
Current Date Time: 14 08 2020 01:35:57.684

LogMessage: ImportValidateData method Starts
-------------------------------

Log Entry: 
Current Date Time: 14 08 2020 01:35:58.271

LogMessage: ImportValidateData method Ends
-------------------------------

Log Entry: 
Current Date Time: 14 08 2020 01:35:58.272

LogMessage: InsertTicketData method Starts
-------------------------------

Log Entry: 
Current Date Time: 14 08 2020 01:35:58.404

LogMessage: InsertTicketData method Ends
-------------------------------

Log Entry: 
Current Date Time: 14 08 2020 01:35:58.407

LogMessage: ValidateInputData method Starts
-------------------------------

Log Entry: 
Current Date Time: 14 08 2020 01:35:58.487

LogMessage: ValidateInputData method Ends
-------------------------------

Log Entry: 
Current Date Time: 14 08 2020 01:35:58.488

LogMessage: GetValidationSummary method Starts
-------------------------------

Log Entry: 
Current Date Time: 14 08 2020 01:35:58.570

LogMessage: GetValidationSummary method Ends
-------------------------------

Log Entry: 
Current Date Time: 14 08 2020 01:35:58.571

LogMessage: GetInvalidTicketData method Starts
-------------------------------

Log Entry: 
Current Date Time: 14 08 2020 01:35:58.618

LogMessage: GetInvalidTicketData method Ends
-------------------------------

As you can see, the log contains 12 log entries.
Logstash indexes 11 of them into Elasticsearch successfully, and it also produces one event tagged _grokparsefailure.
Below is one of the successfully parsed events, followed by the _grokparsefailure event.

{
            "log" => {
        "file" => {
            "path" => "C:/Users/avisriva/Documents/Logs/TDW_9 Logs/logs_14082020.txt"
        }
    },
       "@version" => "1",
        "message" => "Log Entry: \r\nCurrent Date Time: 14 08 2020 01:35:58.272\r\n\r\nLogMessage: InsertTicketData method Starts\r\n-------------------------------\r\n\r",
          "event" => {
        "original" => "Log Entry: \r\nCurrent Date Time: 14 08 2020 01:35:58.272\r\n\r\nLogMessage: InsertTicketData method Starts\r\n-------------------------------\r\n\r"
    },
           "tags" => [
        [0] "multiline"
    ],
    "dateandtime" => [
        [0] "14",
        [1] "08",
        [2] "2020",
        [3] "01:35:58.27"
    ],
     "@timestamp" => 2020-08-13T20:05:58.270Z,
    "millisecond" => "2",
      "timestamp" => "14 08 2020 01:35:58.27",
           "host" => {
        "name" => "LIN20002873"
    },
            "msg" => "InsertTicketData method Starts"
}
{
           "log" => {
        "file" => {
            "path" => "C:/Users/avisriva/Documents/Logs/TDW_9 Logs/logs_14082020.txt"
        }
    },
          "tags" => [
        [0] "_grokparsefailure"
    ],
    "@timestamp" => 2022-05-02T09:03:40.567386800Z,
      "@version" => "1",
       "message" => "\r",
          "host" => {
        "name" => "LIN20002873"
    },
         "event" => {
        "original" => "\r"
    }
}

But when I press CTRL+C in the Windows cmd window where Logstash is running, the last LogMessage is read and indexed into Elasticsearch successfully!

As you can see below:

[2022-05-02T14:34:53,135][WARN ][logstash.runner          ] SIGINT received. Shutting down.
[2022-05-02T14:34:53,172][INFO ][filewatch.observingtail  ] QUIT - closing all files and shutting down.
{
            "log" => {
        "file" => {
            "path" => "C:/Users/avisriva/Documents/Logs/TDW_9 Logs/logs_14082020.txt"
        }
    },
       "@version" => "1",
        "message" => "Log Entry: \r\nCurrent Date Time: 14 08 2020 01:35:58.618\r\n\r\nLogMessage: GetInvalidTicketData method Ends\r\n-------------------------------\r",
          "event" => {
        "original" => "Log Entry: \r\nCurrent Date Time: 14 08 2020 01:35:58.618\r\n\r\nLogMessage: GetInvalidTicketData method Ends\r\n-------------------------------\r"
    },
           "tags" => [
        [0] "multiline"
    ],
    "dateandtime" => [
        [0] "14",
        [1] "08",
        [2] "2020",
        [3] "01:35:58.61"
    ],
     "@timestamp" => 2020-08-13T20:05:58.610Z,
    "millisecond" => "8",
      "timestamp" => "14 08 2020 01:35:58.61",
           "host" => {
        "name" => "LIN20002873"
    },
            "msg" => "GetInvalidTicketData method Ends"
}
[2022-05-02T14:34:54,262][INFO ][logstash.javapipeline    ][main] Pipeline terminated {"pipeline.id"=>"main"}
[2022-05-02T14:34:55,222][INFO ][logstash.runner          ] Logstash shut down.
Terminate batch job (Y/N)? Y

How do I resolve this issue?

With that configuration the codec will flush an event every time it finds a line that contains "Log Entry: ". If there is a blank line at the start of the file then that will be flushed as an event. If there are lines at the end of the file following the last "Log Entry: " then logstash will wait forever for another "Log Entry: " to be added before flushing them.

You can use the auto_flush_interval option to set a timeout for this.

Yes, there is a newline at the start of the log file, before the first log entry.
Thanks for your explanation. I now understand that this leading newline is flushed as its own event, which produces the _grokparsefailure below.

"tags" => [
        [0] "_grokparsefailure"
    ],
    "@timestamp" => 2022-05-02T09:03:40.567386800Z,
      "@version" => "1",
       "message" => "\r",
          "host" => {
        "name" => "LIN20002873"
    },
         "event" => {
        "original" => "\r"
    }

As per your recommendation, I have added an auto_flush_interval of 1 second, like below:

codec => multiline{
	pattern => "Log Entry: "
	negate => true
	what => previous
	auto_flush_interval => 1
}

All other Logstash configuration is the same as before.

Now all 12 events are indexed into Elasticsearch successfully.
But Logstash is still stuck and does not exit by itself.
Also, I still get the _grokparsefailure as before.

Logstash with a file input will wait forever for new files, or for data to be appended to existing files.
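If you only need to process a static file once, you can run the file input in read mode instead of the default tail mode. A minimal sketch, assuming your version of the file input plugin supports the mode and exit_after_read options (the file_completed_log_path shown here is just an example path):

input {
	file {
		path => "C:/Users/avisriva/Documents/Logs/TDW_9 Logs/logs_14082020.txt"
		mode => "read"
		exit_after_read => true
		file_completed_action => "log"
		file_completed_log_path => "C:/Users/avisriva/Documents/Logs/completed.log"
		codec => multiline {
			pattern => "Log Entry: "
			negate => true
			what => previous
			auto_flush_interval => 1
		}
	}
}

Note that in read mode file_completed_action defaults to "delete", so set it explicitly if you want to keep the source file.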

OK, then how do I get rid of the _grokparsefailure events? Any idea?

Set tag_on_failure => [] on the grok filter.
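Applied to the grok filter from the original config, that would be:

grok {
	patterns_dir => ["./patterns"]
	match => ["message", "Log Entry: \r\nCurrent Date Time: %{CUST_TIMESTAMP:timestamp}%{INT:millisecond}\r\n\r\nLogMessage: %{DATA:msg}\r\n"]
	tag_on_failure => []
}

Note that this only suppresses the tag; the unmatched event itself is still emitted.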

Thanks for the response, but with tag_on_failure => [] the unparsed event is still indexed. All it does is leave the tags field empty instead of adding _grokparsefailure.

That didn't solve my problem. So I added a conditional after the grok filter (keeping the default tag_on_failure, so the tag is still added) to drop the failed events:

if "_grokparsefailure" in [tags] {
	drop {}
}

and the issue was resolved.
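For reference, here is how the pieces fit together in the filter section. The conditional must come after the grok filter, so that the tag has already been added when the condition is evaluated:

filter {
	grok {
		patterns_dir => ["./patterns"]
		match => ["message", "Log Entry: \r\nCurrent Date Time: %{CUST_TIMESTAMP:timestamp}%{INT:millisecond}\r\n\r\nLogMessage: %{DATA:msg}\r\n"]
	}
	if "_grokparsefailure" in [tags] {
		drop {}
	}
	date {
		match => ["timestamp", "dd MM yyyy HH:mm:ss.SSS"]
	}
	# ... remaining mutate filters unchanged
}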