Logstash Value too large to output

Hello, I have new core app that I need to parse.
This app has strange big message output field and I need somehow add it to ELK.
Log pattern:

            grok {
                match => [ "message", "%{DATA:[event][time]} \[%{DATA:[event][thread]}\] %{DATA:[event][level]} TenantId: %{DATA:[tenant][id]}( ClientId: %{DATA:[client][id]})?( SessionId: %{DATA:[session][id]})?( RequestId: %{DATA:[request][id]})? Method: %{DATA:[request][method]} %{DATA:[request][url]}( StatusCode: %{DATA:[response][status][code]})? Request Headers: %{DATA:[request][headers]} Response Headers: %{DATA:[response][headers]} Message: %{GREEDYDATA:[response][message]}" ]
			}

Logstash give me WARN
Timeout executing grok '%{DATA:[event][time]} [%{DATA:[event][thread]}] %{DATA:[event][level]} TenantId: %{DATA:[tenant][id]}( ClientId: %{DATA:[client][id]})?( SessionId: %{DATA:[session][id]})?( RequestId: %{DATA:[request][id]})? Method: %{DATA:[request][method]} %{DATA:[request][url]}( StatusCode: %{DATA:[response][status][code]})? Request Headers: %{DATA:[request][headers]} Response Headers: %{DATA:[response][headers]} Message: %{GREEDYDATA:[response][message]}' against field 'message' with value 'Value too large to output (9371 bytes)! First 255 chars are: begin of log

Last "message" field has a lot of line with error information (text), so I can't parse this. Can you give me some advise? I don't need to parse latest message, just need to put in like as is it in output

log example:

Summary

19721064 [(null)] ERROR
TenantId: (null)
ClientId: (null)
SessionId: (null)
UserId: (null)
RequestId: (null)
Method: (null) (null)
StatusCode: (null)
Request Headers:
(null)
Response Headers:
(null)
Message: An exception occurred while iterating over the results of a query for context type 'name.CC.Dal.Contexts.nameSystemContext'.
Microsoft.Data.SqlClient.SqlException (0x80131904): Connection Timeout Expired. The timeout period elapsed during the post-login phase. The connection could have timed out while waiting for server to complete the login process and respond; Or it could have timed out while attempting to create multiple active connections. The duration spent while attempting to connect to this server was - [Pre-Login] initialization=15415; handshake=41566; [Login] initialization=0; authentication=0; [Post-Login] complete=3233;
---> System.ComponentModel.Win32Exception (258): Unknown error 258
at Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
at Microsoft.Data.SqlClient.TdsParserStateObject.ThrowExceptionAndWarning(Boolean callerHasConnectionLock, Boolean asyncClose)
at Microsoft.Data.SqlClient.TdsParserStateObject.ReadSniError(TdsParserStateObject stateObj, UInt32 error)
at Microsoft.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync()
at Microsoft.Data.SqlClient.TdsParserStateObject.TryReadNetworkPacket()
at Microsoft.Data.SqlClient.TdsParserStateObject.TryPrepareBuffer()
at Microsoft.Data.SqlClient.TdsParserStateObject.TryReadByte(Byte& value)
at Microsoft.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
at Microsoft.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj)
at Microsoft.Data.SqlClient.SqlInternalConnectionTds.CompleteLogin(Boolean enlistOK)
at Microsoft.Data.SqlClient.SqlInternalConnectionTds.AttemptOneLogin(ServerInfo serverInfo, String newPassword, SecureString newSecurePassword, Boolean ignoreSniOpenTimeout, TimeoutTimer timeout, Boolean withFailover)
at Microsoft.Data.SqlClient.SqlInternalConnectionTds.LoginNoFailover(ServerInfo serverInfo, String newPassword, SecureString newSecurePassword, Boolean redirectedUserInstance, SqlConnectionString connectionOptions, SqlCredential credential, TimeoutTimer timeout)
at Microsoft.Data.SqlClient.SqlInternalConnectionTds.OpenLoginEnlist(TimeoutTimer timeout, SqlConnectionString connectionOptions, SqlCredential credential, String newPassword, SecureString newSecurePassword, Boolean redirectedUserInstance)
at Microsoft.Data.SqlClient.SqlInternalConnectionTds..ctor(DbConnectionPoolIdentity identity, SqlConnectionString connectionOptions, SqlCredential credential, Object providerInfo, String newPassword, SecureString newSecurePassword, Boolean redirectedUserInstance, SqlConnectionString userConnectionOptions, SessionData reconnectSessionData, Boolean applyTransientFaultHandling, String accessToken, DbConnectionPool pool)
at Microsoft.Data.SqlClient.SqlConnectionFactory.CreateConnection(DbConnectionOptions options, DbConnectionPoolKey poolKey, Object poolGroupProviderInfo, DbConnectionPool pool, DbConnection owningConnection, DbConnectionOptions userOptions)
at Microsoft.Data.ProviderBase.DbConnectionFactory.CreateNonPooledConnection(DbConnection owningConnection, DbConnectionPoolGroup poolGroup, DbConnectionOptions userOptions)
at Microsoft.Data.ProviderBase.DbConnectionFactory.TryGetConnection(DbConnection owningConnection, TaskCompletionSource1 retry, DbConnectionOptions userOptions, DbConnectionInternal oldConnection, DbConnectionInternal& connection) at Microsoft.Data.ProviderBase.DbConnectionInternal.TryOpenConnectionInternal(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource1 retry, DbConnectionOptions userOptions)
at Microsoft.Data.SqlClient.SqlConnection.TryOpen(TaskCompletionSource1 retry, SqlConnectionOverrides overrides) at Microsoft.Data.SqlClient.SqlConnection.Open(SqlConnectionOverrides overrides) at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.OpenInternal(Boolean errorsExpected) at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.Open(Boolean errorsExpected) at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReader(RelationalCommandParameterObject parameterObject) at Microsoft.EntityFrameworkCore.Query.Internal.SingleQueryingEnumerable1.Enumerator.InitializeReader(Enumerator enumerator)
at Microsoft.EntityFrameworkCore.SqlServer.Storage.Internal.SqlServerExecutionStrategy.Execute[TState,TResult](TState state, Func3 operation, Func3 verifySucceeded)
at Microsoft.EntityFrameworkCore.Query.Internal.SingleQueryingEnumerable1.Enumerator.MoveNext() ClientConnectionId:iuweui2yifgoi2 Error Number:-2,State:0,Class:11 Microsoft.Data.SqlClient.SqlException (0x80131904): Connection Timeout Expired. The timeout period elapsed during the post-login phase. The connection could have timed out while waiting for server to complete the login process and respond; Or it could have timed out while attempting to create multiple active connections. The duration spent while attempting to connect to this server was - [Pre-Login] initialization=15415; handshake=41566; [Login] initialization=0; authentication=0; [Post-Login] complete=3233; ---> System.ComponentModel.Win32Exception (258): Unknown error 258 at Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose) at Microsoft.Data.SqlClient.TdsParserStateObject.ThrowExceptionAndWarning(Boolean callerHasConnectionLock, Boolean asyncClose) at Microsoft.Data.SqlClient.TdsParserStateObject.ReadSniError(TdsParserStateObject stateObj, UInt32 error) at Microsoft.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync() at Microsoft.Data.SqlClient.TdsParserStateObject.TryReadNetworkPacket() at Microsoft.Data.SqlClient.TdsParserStateObject.TryPrepareBuffer() at Microsoft.Data.SqlClient.TdsParserStateObject.TryReadByte(Byte& value) at Microsoft.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady) at Microsoft.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj) at Microsoft.Data.SqlClient.SqlInternalConnectionTds.CompleteLogin(Boolean enlistOK) at Microsoft.Data.SqlClient.SqlInternalConnectionTds.AttemptOneLogin(ServerInfo serverInfo, String newPassword, SecureString newSecurePassword, Boolean ignoreSniOpenTimeout, TimeoutTimer timeout, Boolean withFailover) at Microsoft.Data.SqlClient.SqlInternalConnectionTds.LoginNoFailover(ServerInfo serverInfo, String newPassword, SecureString newSecurePassword, Boolean redirectedUserInstance, SqlConnectionString connectionOptions, SqlCredential credential, TimeoutTimer timeout) at Microsoft.Data.SqlClient.SqlInternalConnectionTds.OpenLoginEnlist(TimeoutTimer timeout, SqlConnectionString connectionOptions, SqlCredential credential, String newPassword, SecureString newSecurePassword, Boolean redirectedUserInstance) at Microsoft.Data.SqlClient.SqlInternalConnectionTds..ctor(DbConnectionPoolIdentity identity, SqlConnectionString connectionOptions, SqlCredential credential, Object providerInfo, String newPassword, SecureString newSecurePassword, Boolean redirectedUserInstance, SqlConnectionString userConnectionOptions, SessionData reconnectSessionData, Boolean applyTransientFaultHandling, String accessToken, DbConnectionPool pool) at Microsoft.Data.SqlClient.SqlConnectionFactory.CreateConnection(DbConnectionOptions options, DbConnectionPoolKey poolKey, Object poolGroupProviderInfo, DbConnectionPool pool, DbConnection owningConnection, DbConnectionOptions userOptions) at Microsoft.Data.ProviderBase.DbConnectionFactory.CreateNonPooledConnection(DbConnection owningConnection, DbConnectionPoolGroup poolGroup, DbConnectionOptions userOptions) at Microsoft.Data.ProviderBase.DbConnectionFactory.TryGetConnection(DbConnection owningConnection, TaskCompletionSource1 retry, DbConnectionOptions userOptions, DbConnectionInternal oldConnection, DbConnectionInternal& connection)
at Microsoft.Data.ProviderBase.DbConnectionInternal.TryOpenConnectionInternal(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource1 retry, DbConnectionOptions userOptions) at Microsoft.Data.SqlClient.SqlConnection.TryOpen(TaskCompletionSource1 retry, SqlConnectionOverrides overrides)
at Microsoft.Data.SqlClient.SqlConnection.Open(SqlConnectionOverrides overrides)
at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.OpenInternal(Boolean errorsExpected)
at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.Open(Boolean errorsExpected)
at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReader(RelationalCommandParameterObject parameterObject)
at Microsoft.EntityFrameworkCore.Query.Internal.SingleQueryingEnumerable1.Enumerator.InitializeReader(Enumerator enumerator) at Microsoft.EntityFrameworkCore.SqlServer.Storage.Internal.SqlServerExecutionStrategy.Execute[TState,TResult](TState state, Func3 operation, Func3 verifySucceeded) at Microsoft.EntityFrameworkCore.Query.Internal.SingleQueryingEnumerable1.Enumerator.MoveNext()
ClientConnectionId:12413401-134134
Error Number:-2,State:0,Class:11

The 'value too large' is just the grok filter saying it doesn't want to pollute the logstash logs with long messages, so it just prints the 255 characters when an error occurs. The underlying error is the timeout matching the pattern.

A timeout occurs when the grok filter spends too much time failing to match the pattern to the field. My first suggestion is read this blog post about grok from Elastic. My second suggestion is to re-read it.

DATA can match anything, or nothing. You need something that will let the regexp engine anchor it to speed things up. A text string like "ClientId" would work, except that you have made several of those optional. Options are expensive because when the match fails the regexp engine has to explore both choices. And that is both choices of every optional field, which causes an exponential increase in the cost of the pattern.

GREEDYDATA is even more expensive, best practice is (as you have) only use GREEDYDATA at the end of a pattern to consume "everything that is left".

To improve the performance of your pattern I would suggest replacing DATA with a more specific pattern where possible. For example, in many cases NOTSPACE will work. Can a tenant/client/session/request id contain a space? If not, NOTSPACE will be cheaper than DATA.

Also, you probably do not have a requirement to match the whole field with a single pattern. The following will likely be faster, except in this case DATA may match nothing (resulting in any empty match that gets discarded).

grok {
    match => {
        break_on_match => false
        "message" => [
            "^%{DATA:[event][time]} \[%{DATA:[event][thread]}\] %{DATA:[event][level]} TenantId: %{DATA:[tenant][id]}",
            "ClientId: %{DATA:[client][id]}",
            "SessionId: %{DATA:[session][id]})",
            "RequestId: %{DATA:[request][id]}",
            "Method: %{DATA:[request][method]} %{DATA:[request][url]}"
            "StatusCode: %{DATA:[response][status][code]}",
            "Request Headers: %{DATA:[request][headers]}",
            "Response Headers: %{DATA:[response][headers]}",
            "Message: %{GREEDYDATA:[response][message]}"
        ]
    }
}

Using NOTSPACE may fix this, or if you want to consume the rest of the line then use a custom pattern. For example, replace ClientId: %{DATA:[client][id]} with ClientId: (?<[client][id]>[^\n]*).

1 Like

I would add few more things...
You can increase timeout -timeout_millis, which may slow down performances.
Add id to measure improvement performances: http://localhost:9600/_node/stats/pipelines?pretty

grok {
    match => {
        break_on_match => false
        "message" => [...]
    }
    id => "grok" 
}

Since this message is sooo long, think about dissect parser, I think it would be possible.

@Badger thanks for advise, I will change DATA one by one and check if pattern is working fine
@Rios it's a good idea to change message from grok to dissect because I don't need to parse latest message field. Also I saw that big message field is when event level error, so I added one IF.
But it didn't work.

        grok {
            match => [ "message", "%{DATA:[event][time]} \[%{DATA:[event][thread]}\] %{DATA:[event][level]} TenantId: %{NOTSPACE:[tenant][id]}( ClientId: %{NOTSPACE:[client][id]})?( SessionId: %{NOTSPACE:[session][id]})? RequestId: (%{DATA:[request][id]} )?Method: %{DATA:[request][method]} %{DATA:[request][url]}( StatusCode: %{DATA:[response][status][code]})? Request Headers: %{DATA:[request][headers]} Response Headers: %{GREEDYDATA:[response][headers]} Message: %{GREEDYDATA:[response][message]}" ]
  	}
  	}
  	if ([event][level] =~ "ERROR")
  	{
  		dissect
  		{
  	mapping => [ "[response][message]", "%{messageee}" ]
  	}
    }

Any advise?

OK, I've changed DATA to NOSPACE, error is missing now.
But log cannot be parsed, and I didn't see any error or message. Tried timeout_millis => 30000 and 300000
Who can assist?

            grok {
				break_on_match => false
				timeout_millis => 30000
                match => [ "message", "^%{NOSPACE:[event][time]} \[%{NOSPACE:[event][thread]}\] %{NOSPACE:[event][level]} TenantId: %{NOSPACE:[tenant][id]}( ClientId: %{NOSPACE:[client][id]})?( SessionId: %{NOSPACE:[session][id]})? RequestId: %{NOSPACE:[request][id]} Method: %{NOSPACE:[request][method]} %{NOSPACE:[request][url]}( StatusCode: %{NOSPACE:[response][status][code]})? Request Headers: %{NOSPACE:[request][headers]} Response Headers: %{NOSPACE:[response][headers]} Message: %{GREEDYDATA:[response][message]}$" ]
			}

That should be NOTSPACE, not NOSPACE, and using an undefined pattern should be producing an error message.

Sorry, my mistake. Bad copy-paste with replace. Im using NOTSPACE.
So, I have tested once again, and there is no any warn or error. This pattern doesn't work.
Full filter pattern

filter {
	if [type] == "cloudwatch_cc" {

		mutate { gsub => [ "message", "\r\n", "" ] }

		if [cloudwatch_logs][log_group] =~ /.+/ {
			grok {
                break_on_match => false
				match => [ "[cloudwatch_logs][log_group]", "/%{DATA:[event][environment]}/%{DATA:[tenant][code]}/%{DATA:[event][source]}/%{GREEDYDATA:[event][application]}"]
				add_field => { "[@metadata][index]" => "aws.cloudwatch" }
				add_field => { "[@metadata][status]" => "parsed" }
			}
		if [event][source] == "application"{
			grok {
				break_on_match => false
                timeout_millis => 30000
                match => [ "message", "^%{NOTSPACE:[event][time]} \[%{NOTSPACE:[event][thread]}\] %{NOTSPACE:[event][level]} TenantId: %{NOTSPACE:[tenant][id]}( ClientId: %{NOTSPACE:[client][id]})?( SessionId: %{NOTSPACE:[session][id]})? RequestId: %{NOTSPACE:[request][id]} Method: %{NOTSPACE:[request][method]} %{NOTSPACE:[request][url]}( StatusCode: %{NOTSPACE:[response][status][code]})? Request Headers: %{NOTSPACE:[request][headers]} Response Headers: %{NOTSPACE:[response][headers]} Message: %{GREEDYDATA:[response][message]}$" ]
			}
		}		
		if [tenant][code] == "log4net" or [tenant][code] == "landing" {
			mutate { replace => { "[tenant][code]" => "UNKNOWN" } }
		}
		mutate { remove_field => [ "message", "[type]" ] }
		}
	ruby { code => "event.set('[@metadata][index_date]', event.get('@timestamp').time.utc.strftime('%Y-%m'))" }
	}
}

Also strange thing, only first if [cloudwatch_logs][log_group] working, because I see parsed fields in kibana
added image

Summary

As I understood issue is in multiline last field message, any suggestions?

Fixed
separate grok to few another grok's.

  	grok {
            match => [ "message", "%{DATA:[event][time]} \[%{DATA:[event][thread]}\] %{DATA:[event][level]} TenantId: %{DATA:[tenant][id]}( ClientId: %{DATA:[client][id]})?( SessionId: %{DATA:[session][id]})?( UserId: %{DATA:[user][id]})? RequestId: (%{DATA:[request][id]} )?Method: %{DATA:[request][method]} %{DATA:[request][url]}( StatusCode: %{DATA:[response][status][code]})?" ]
  	}
  	grok {
            match => [ "message", "Request Headers: %{GREEDYDATA:[request][headers]}" ]
  	}
  	grok {
            match => [ "message", "Response Headers: %{GREEDYDATA:[response][headers]}" ]
  	}
  	grok {
            match => [ "message", "Message: %{GREEDYDATA:[response][message]}" ]
  	}

Trying replace DATA and GREEDYDATA to optimise patternt.