Pipeline - grok - truncating at "\n" in [multiline] message from filebeat

Filebeat is configured to correctly process a multiline file.
Using the ingest pipeline, the grok processor extracts fields from the "message" field.
However, it is truncating the message when the message contains a newline ("\n").
Note: this worked perfectly fine in [a very] early version of ELK.

e.g. the file contains:
2019-12-12 14:30:49.0276 ERROR Core.Processors.PurchaseInvoiceProcessor Failed to create Purchase Invoice for Purchase Order with Order # 'NNNNN' not found.
at Vantage.Services.SupplierLiveLink.Core.Processors.PurchaseInvoiceProcessor.<Process>d__7.MoveNext()

The pipeline grok pattern is
%{TIMESTAMP_ISO8601:timestamp} %{LEVEL:level} %{GREEDYDATA:msg}
followed by these processors:
, { "set" : { "field" : "message_original", "value" : "{{message}}" } }
, { "remove" : { "field" : "message" } }
, { "rename" : { "field" : "msg", "target_field" : "message" } }

but the message field appears in Kibana as truncated at the newline, e.g.
message
Core.Processors.PurchaseInvoiceProcessor Failed to create Purchase Invoice for Purchase Order with Order # 'NNNNN' not found.

and the message_original field
2019-12-12 14:30:49.0276 ERROR Core.Processors.PurchaseInvoiceProcessor Failed to create Purchase Invoice for Purchase Order with Order # 'NNNNN' not found.\n at Vantage.Services.SupplierLiveLink.Core.Processors.PurchaseInvoiceProcessor.<Process>d__7.MoveNext()

Do you have a fully reproducible example that contains a Filebeat configuration and a sample file, so others could try it and see whether the multiline message is indeed parsed as one event? (It doesn't look like it from the current information.)
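
Even just the ingest side in isolation would help narrow it down: you could run your grok against a hand-crafted multiline message with the Simulate Pipeline API and see what it captures. A minimal sketch (the sample message is made up and abbreviated, and I'm using the standard LOGLEVEL pattern as a stand-in for your LEVEL):

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": ["%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:msg}"]
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "2019-12-12 14:30:49.0276 ERROR Something failed\n   at Some.Namespace.SomeMethod()"
      }
    }
  ]
}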

Okay, so here's another example (includes a .NET stack trace):

2019-12-12 01:46:08.3228 FATAL Services.Windows.Session DATA1L Failed to process changes: DATA: Failed to get customer details from database. (Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.) Services.Core.Exception: DATA: Failed to get customer details from database. ---> System.Data.SqlClient.SqlException: Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding. ---> System.ComponentModel.Win32Exception: The wait operation timed out
   --- End of inner exception stack trace ---
   at System.Data.SqlClient.SqlCommand.EndExecuteReaderAsync(IAsyncResult asyncResult)
   at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.ValidateEnd(Task task)
   at Services.ThirdParty.ThirdPartyClient.<GetCustomerStatus>d__33.MoveNext()
   --- End of inner exception stack trace ---
   at Services.ThirdParty.ThirdPartyClient.<GetCustomerStatus>d__33.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Services.ThirdParty.ThirdPartyClient.<GetChanges>d__32.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.ValidateEnd(Task task)
   at Services.Windows.Session.<Process>d__19.MoveNext()

with the YAML config for the input being:

- type: log
  enabled: false

  paths:
    - D:\logs\*-error-*.log

  multiline:
    pattern: ^20[0-9]{2}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{1,4}
    negate: true
    match: after

  close_removed: true
  clean_removed: true
  
  scan_frequency: 10s
  force_close_files: true

and from the pipeline:

    , "processors" : [
        {
            "grok" : {
                "field" : "message"
                , "ignore_missing": true
                , "patterns" : [
                    "%{TIMESTAMP_ISO8601:timestamp} %{EG_LEVEL:level} %{GREEDYDATA:msg}"
                ]
                , "pattern_definitions" : {
                    "EG_LEVEL" : "FATAL|ERROR|WARN|INFO|DEBUG|TRACE"
                }
            }
        }
        , { 
            "set" : {
                "field" : "message_original"
                , "value" : "{{message}}"
                }
        }
        , {
            "remove" : {
                "field" : "message"
            }
        }        
        , { 
            "rename" : {
                "field" : "msg"
                , "target_field" : "message"
                }
        }
        , {
            "rename": {
              "field": "@timestamp"
              , "target_field": "event.created"
            }
        }
        , {
            "date": {
              "field": "timestamp",
              "target_field": "@timestamp",
              "formats": [
                "yyyy-MM-dd HH:mm:ss.SSSS"
              ],
              "timezone": "Europe/London"
            }
        }
        , {
            "remove": {
              "field": "timestamp"
            }
        }

Using the above pipeline, the message field is truncated at the first newline ("\n"), and message_original contains the full log message but as a single line (with literal "\n" sequences appearing). [Unfortunately the trial version of Elastic Cloud has reached its data limit and is now not even showing me results in Kibana, so I am unable to get another capture.]

I have previously used a very similar config and pipeline which worked, but that was on a very early version of the Beats agent and Logstash.

I just tested this locally and it worked just fine for me on 7.5.0.

One thing that seems wrong is that you have not enabled the log input (enabled: false), so no data gets read at all by Filebeat.

Also, I can only make assumptions about the rest of your configuration, as this is still not a reproducible example (I had to fill in some gaps in the Filebeat config, for example), so I can only hope I did it somewhat similarly to your setup.
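
Something like the following would be the minimal input to try (a sketch on my side; adjust the path and the rest to your environment):

filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - /tmp/multiline-test/*-error-*.log
    multiline:
      pattern: '^20[0-9]{2}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{1,4}'
      negate: true
      match: after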

Ref the "enabled" - i have several haversters configured and have copied one that is flagged as disabled. But despite it being disabled filebeat is still seems to be processing the file. but ignoring that.

I've restored the Elastic Cloud trial so I can access the "message_original" field again.
I've also added the "filebeat.yml" below in case it's a config issue; you can see the pipeline selection uses custom fields assigned at the root.
Note: in the pipeline I have added a temp value so I know which pipeline (and version) was used.

###################### Filebeat Configuration Example #########################

# This file is an example configuration file highlighting only the most common
# options. The filebeat.reference.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.
#
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/filebeat/index.html

# For more available modules and options, please see the filebeat.reference.yml sample
# configuration file.

#=========================== Filebeat inputs =============================
filebeat.config.inputs:
    enabled: true
    path: config/*.yml

#============================= Filebeat modules ===============================

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml

  # Set to true to enable config reloading
  reload.enabled: false

  # Period on which files under path should be checked for changes
  #reload.period: 10s

#==================== Elasticsearch template setting ==========================

setup.template.settings:
  index.number_of_shards: 1
  #index.codec: best_compression
  #_source.enabled: false

#default template :: will append %{[agent.version]}
setup.template.name: "filebeat"
setup.template.pattern: "filebeat-*"

#================================ General =====================================
  
# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
#name:

# The tags of the shipper are included in their own field with each
# transaction published.
#tags: ["service-X", "web-tier"]

# Optional fields that you can specify to add additional information to the
# output.
#fields:
#  env: staging


#============================== Dashboards =====================================
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here or by using the `setup` command.
#setup.dashboards.enabled: false

# The URL from where to download the dashboards archive. By default this URL
# has a value which is computed based on the Beat name and version. For released
# versions, this URL points to the dashboard archive on the artifacts.elastic.co
# website.
#setup.dashboards.url:

#============================== Kibana =====================================

# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:

  # Kibana Host
  # Scheme and port can be left out and will be set to the default (http and 5601)
  # In case you specify an additional path, the scheme is required: http://localhost:5601/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
  #host: "localhost:5601"

  # Kibana Space ID
  # ID of the Kibana Space into which the dashboards should be loaded. By default,
  # the Default Space will be used.
  #space.id:

#============================= Elastic Cloud ==================================

# These settings simplify using Filebeat with the Elastic Cloud (https://cloud.elastic.co/).

# The cloud.id setting overwrites the `output.elasticsearch.hosts` and
# `setup.kibana.host` options.
# You can find the `cloud.id` in the Elastic Cloud web UI.
cloud.id: cloudName:cloudPW

# The cloud.auth setting overwrites the `output.elasticsearch.username` and
# `output.elasticsearch.password` settings. The format is `<user>:<pass>`.
cloud.auth: user:PW

#================================ Outputs =====================================

# Configure what output to use when sending the data collected by the beat.

#-------------------------- Elasticsearch output ------------------------------
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["localhost:9200"]

  # Optional protocol and basic auth credentials.
  #protocol: "https"
  #username: "elastic"
  #password: "changeme"
  
  pipelines:
    - pipeline: "exceptions_pipeline"
      when.and:
        - equals:
            service.type: "service"
        - equals:
            error: true

  indices:
    - index: "filebeat-%{[service.type]}-%{[agent.version]}-%{+yyyy.MM.dd}"
    
#----------------------------- Logstash output --------------------------------
#output.logstash:
  # The Logstash hosts
  #hosts: ["localhost:5044"]

  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"

#================================ Processors =====================================

# Configure processors to enhance or manipulate events generated by the beat.

processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~

#================================ Logging =====================================

# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
#logging.level: debug

# At debug level, you can selectively enable logging only for some components.
# To enable all selectors use ["*"]. Examples of other selectors are "beat",
# "publish", "service".
#logging.selectors: ["*"]

#============================== X-Pack Monitoring ===============================
# filebeat can export internal metrics to a central Elasticsearch monitoring
# cluster.  This requires xpack monitoring to be enabled in Elasticsearch.  The
# reporting is disabled by default.

# Set to true to enable the monitoring reporter.
#monitoring.enabled: false

# Sets the UUID of the Elasticsearch cluster under which monitoring data for this
# Filebeat instance will appear in the Stack Monitoring UI. If output.elasticsearch
# is enabled, the UUID is derived from the Elasticsearch cluster referenced by output.elasticsearch.
#monitoring.cluster_uuid:

# Uncomment to send the metrics to Elasticsearch. Most settings from the
# Elasticsearch output are accepted here as well.
# Note that the settings should point to your Elasticsearch *monitoring* cluster.
# Any setting that is not set is automatically inherited from the Elasticsearch
# output configuration, so if you have the Elasticsearch output configured such
# that it is pointing to your Elasticsearch monitoring cluster, you can simply
# uncomment the following line.
#monitoring.elasticsearch:

#================================= Migration ==================================

# This allows to enable 6.7 migration aliases
#migration.6_to_7.enabled: true

So this entry in the file (on Windows, so each line is terminated with CRLF):

2019-12-17 10:45:39.6708 ERROR Services.Imap.Windows.ImapSession Socket Exception: TimedOut Message: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond N.N.N.N:993 System.Net.Sockets.SocketException (0x80004005): A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond N.N.N.N:993
   at System.Net.Sockets.Socket.InternalEndConnect(IAsyncResult asyncResult)
   at System.Net.Sockets.Socket.EndConnect(IAsyncResult asyncResult)
   at System.Net.Sockets.TcpClient.EndConnect(IAsyncResult asyncResult)
   at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Core.Extensions.TaskExtensions.<WaitAsync>d__5.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Core.Extensions.TaskExtensions.<WaitAsyncWithTimeout>d__4.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Services.Imap.Client.ImapClient.<Connect>d__23.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.ValidateEnd(Task task)
   at Services.Imap.Windows.ImapSession.<ProcessImapServer>d__20.MoveNext()

results in this value for message_original after the grok processor.
You can see that it has "\n" for newlines; the value of message is then truncated at the first "\n".
Neither is rendered in Kibana as multiline.

2019-12-17 10:45:39.6708 ERROR Services.Imap.Windows.ImapSession  Socket Exception: TimedOut Message: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond N.N.N.N:993 System.Net.Sockets.SocketException (0x80004005): A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond N.N.N.N:993\n   at System.Net.Sockets.Socket.InternalEndConnect(IAsyncResult asyncResult)\n   at System.Net.Sockets.Socket.EndConnect(IAsyncResult asyncResult)\n   at System.Net.Sockets.TcpClient.EndConnect(IAsyncResult asyncResult)\n   at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)\n--- End of stack trace from previous location where exception was thrown ---\n   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\n   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\n   at Core.Extensions.TaskExtensions.<WaitAsync>d__5.MoveNext()\n--- End of stack trace from previous location where exception was thrown ---\n   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\n   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\n   at Core.Extensions.TaskExtensions.<WaitAsyncWithTimeout>d__4.MoveNext()\n--- End of stack trace from previous location where exception was thrown ---\n   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\n   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\n   at Services.Imap.Client.ImapClient.<Connect>d__23.MoveNext()\n--- End of stack trace from previous location where exception was thrown ---\n   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\n   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\n   at System.Runtime.CompilerServices.TaskAwaiter.ValidateEnd(Task task)\n   at Services.Imap.Windows.ImapSession.<ProcessImapServer>d__20.MoveNext()

and this appears as message [after grok processing], truncated at the first newline:

Socket Exception: TimedOut Message: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond N.N.N.N:993 System.Net.Sockets.SocketException (0x80004005): A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond N.N.N.N:993

I would expect both the original and the parsed message to be multiline, not to render a literal "\n" (or be truncated)?

So you do not get "\n" in your message once imported into Elasticsearch, and the message appears in Kibana as multiline?
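
i.e. when you pull one of your test documents back directly (something along these lines, with the index pattern adjusted), does the msg field contain real line breaks rather than a literal "\n"?

GET filebeat-*/_search
{
  "size": 1,
  "_source": ["message", "message_original"],
  "query": {
    "exists": { "field": "message_original" }
  }
}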

Any ideas on this, now with the full example? It is still happening and makes reading exceptions impossible. Do I need to reformat the "\n" in the pipeline?
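
For example, if GREEDYDATA simply doesn't match across newlines, would swapping it for a pattern definition that can (something like the sketch below, just a guess on my part and not tested) be the right direction?

{
  "grok": {
    "field": "message",
    "patterns": [
      "%{TIMESTAMP_ISO8601:timestamp} %{EG_LEVEL:level} %{EG_MULTILINE:msg}"
    ],
    "pattern_definitions": {
      "EG_LEVEL": "FATAL|ERROR|WARN|INFO|DEBUG|TRACE",
      "EG_MULTILINE": "(.|\n)*"
    }
  }
}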
