Logstash conf, multiline & multimatch

Hi friends

We are trying to process some files with this format:

IMAGE 123-06-gestion 0 0 12 123-06-gestion_1520509780 CNGN_AS_BROADSOFT 0 NULL root i_dia_bi 1 0 1520509780 276 1521114580 0 0 23554400 470 1 1 0 CNGN_AS_BROADSOFT_1520509780_INCR.f NULL NULL 0 1 0 0 0 NULL 1 0 0 0 0 0 0 NULL 0 0 0 NULL 2794 0 0 62089 0 0 NULL NULL 0 0 0 0 NULL NULL 0 12 0 0
HISTO 0 0 0 0 0 0 0 0 0 0
FRAG 1 1 23554400 0 0 0 0 @aaaac fgh-02 262144 0 0 -1 4 1;PureDisk;fgh-02;NETAPP_DiskPool;PureDiskVolume;0 1521114580 0 65536 0 0 0 6 0 1520510056 1 1 NULL NULL 0 0
IMAGE nssa1123sor1-06-gestion 0 0 12 123-06-gestion_1520509780 CNGN_AS_BROADSOFT 0 NULL root i_dia_bi 1 0 1520509780 276 1521114580 0 0 23554400 470 1 1 0 CNGN_AS_BROADSOFT_1520509780_INCR.f NULL NULL 0 1 0 0 0 NULL 1 0 0 0 0 0 0 NULL 0 0 0 NULL 2794 0 0 62089 0 0 NULL NULL 0 0 0 0 NULL NULL 0 12 0 0
HISTO 0 0 0 0 0 0 0 0 0 0
FRAG 1 1 23554400 0 0 0 0 @aaaac fgh-02 262144 0 0 -1 4 1;PureDisk;fgh-02;NETAPP_DiskPool;PureDiskVolume;0 1521114580 0 65536 0 0 0 6 0 1520510056 1 1 NULL NULL 0 0
IMAGE 123-06-gestion 0 0 12 123-06-gestion_1520509780 CNGN_AS_BROADSOFT 0 NULL root i_dia_bi 1 0 1520509780 276 1521114580 0 0 23554400 470 1 1 0 CNGN_AS_BROADSOFT_1520509780_INCR.f NULL NULL 0 1 0 0 0 NULL 1 0 0 0 0 0 0 NULL 0 0 0 NULL 2794 0 0 62089 0 0 NULL NULL 0 0 0 0 NULL NULL 0 12 0 0
HISTO 0 0 0 0 0 0 0 0 0 0
FRAG 1 1 23554400 0 0 0 0 @aaaac sdf-02 262144 0 0 -1 4 1;PureDisk;sdf-02;NETAPP_DiskPool;PureDiskVolume;0 1521114580 0 65536 0 0 0 6 0 1520510056 1 1 NULL NULL 0 0
FRAG 1 1 23554400 0 0 0 0 @aaaac sdf-02 262144 0 0 -1 4 1;PureDisk;sdf-02;NETAPP_DiskPool;PureDiskVolume;0 1521114580 0 65536 0 0 0 6 0 1520510056 1 1 NULL NULL 0 0
FRAG 1 1 23554400 0 0 0 0 @aaaac sdf-02 262144 0 0 -1 4 1;PureDisk;sdf-02;NETAPP_DiskPool;PureDiskVolume;0 1521114580 0 65536 0 0 0 6 0 1520510056 1 1 NULL NULL 0 0

And we are having some problems adjusting the configuration.
We need to process this information following one rule: an IMAGE line, together with every line until the next IMAGE, should become one document in Elasticsearch.
We are trying the following configuration, but the behavior is not as expected.
Could you please support us with this issue?
Thanks in advance.

input {
  file {
    path => "/BCKAud/data/BCKAud_bpimagelist*.txt"
    type => "BCKAud_bpimagelist"
    start_position => "beginning"
    codec => multiline {
      negate => true
      pattern => "^IMAGE"
      what => "previous"
    }
  }
}
filter {
  if [type] == "BCKAud_bpimagelist" {

    if [message] =~ /^HISTO / {
      drop {}
    }

    grok {
      match => {
        "message" => [
          "IMAGE %{NOTSPACE:ClientName} %{NUMBER:Date1} %{NUMBER:Date2} %{NUMBER:Version} %{NOTSPACE:BackupID} %{NOTSPACE:ClassName} %{NUMBER:ClientType} %{NOTSPACE:ProxyClient} %{NOTSPACE:Creator} %{NOTSPACE:ScheduleLabel} %{NUMBER:ScheduleType} %{NUMBER:RetentionLevel} %{NUMBER:BackupTime} %{NUMBER:ElapsedTimeinSeconds} %{NUMBER:Expiration} %{NUMBER:Compression} %{NUMBER:Encryption} %{NUMBER:Kbytes} %{NUMBER:NumFiles} %{NUMBER:Copies} %{NUMBER:NumberFragments} %{NUMBER:FilesFileCompressed} %{NOTSPACE:FilesFile} %{NOTSPACE:SoftwareVersion} %{NOTSPACE:Name1} %{NUMBER:InputOptions} %{NUMBER:PrimaryCopy} %{NUMBER:ImageType} %{NUMBER:TrueImageRecoveryInfo} %{NUMBER:TrueImageRecoveryExpiration} %{NOTSPACE:Keywords} %{NUMBER:MPX} %{NUMBER:ExtendedSecurityInfo} %{NUMBER:IndividualFileRestorefromRaw} %{NUMBER:ImageDumpLevel} %{NUMBER:FileSystemOnly}",
          "FRAG %{NUMBER:CopyNumber} %{NUMBER:FragmentNumber} %{NUMBER:Kilobytes} %{NUMBER:Remainder} %{NUMBER:MediaType} %{NUMBER:Density} %{NUMBER:FileNumber} %{NOTSPACE:IdPATH} %{NOTSPACE:Hostname} %{NUMBER:BlockSize} %{NUMBER:Offset} %{NUMBER:MediaDate} %{NUMBER:DeviceWrittenOn} %{NUMBER:fflag} %{NOTSPACE:MediaDescriptor} %{NUMBER:Expiration} %{NUMBER:MPX} %{NUMBER:RetentionLevel} %{NUMBER:Checkpoint} %{NUMBER:ResumeNBR} %{NUMBER:MediaSeq} %{NUMBER:MediaSubType} %{NUMBER:TrytoKeepTime} %{NUMBER:CopyCreationTime} %{GREEDYDATA:Unused}"
        ]
      }
      remove_field => [ "message" ]
    }
  }
}
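
One caveat we noticed while testing: because the codec only emits an IMAGE block when the next IMAGE line arrives, the last block in the file stays buffered. If your Logstash version supports it, the codec's auto_flush_interval option can flush it after a period of inactivity:

    codec => multiline {
      negate => true
      pattern => "^IMAGE"
      what => "previous"
      auto_flush_interval => 2   # flush the final buffered block after 2 seconds of no new lines
    }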

What behaviour is expected, and what behaviour are you getting?

Hi Badger
Thanks for your interest in our case.
My previous message showed an example of a log file.
We need to process this file with Logstash, indexing the content of the IMAGE, HISTO and FRAG lines as a single Elasticsearch document.
The next set of lines beginning with IMAGE will be the next document.
For this we implemented the configuration I attached, but we cannot process the 3 lines as a single message.
BR

When I run with

input {
  file {
    path => "/tmp/test.txt"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    codec => multiline {
      negate => true
      pattern => "^IMAGE"
      what => "previous"
    }
  }
}
output { stdout { codec => rubydebug } }

I get the lines concatenated:

       "message" => "IMAGE nssa1123sor1-06-gestion 0 0 12 123-06-gestion_1520509780 CNGN_AS_BROADSOFT 0 NULL root i_dia_bi 1 0 1520509780 276 1521114580 0 0 23554400 470 1 1 0 CNGN_AS_BROADSOFT_1520509780_INCR.f NULL NULL 0 1 0 0 0 NULL 1 0 0 0 0 0 0 NULL 0 0 0 NULL 2794 0 0 62089 0 0 NULL NULL 0 0 0 0 NULL NULL 0 12 0 0\nHISTO 0 0 0 0 0 0 0 0 0 0\nFRAG 1 1 23554400 0 0 0 0 @aaaac fgh-02 262144 0 0 -1 4 1;PureDisk;fgh-02;NETAPP_DiskPool;PureDiskVolume;0 1521114580 0 65536 0 0 0 6 0 1520510056 1 1 NULL NULL 0 0",
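
Note that once the codec has joined the lines, [message] starts with IMAGE, so a conditional like [message] =~ /^HISTO / will never be true and the drop {} in your filter never runs. If you just want the HISTO line gone from the joined message, something like this (an untested sketch) might do:

filter {
  mutate {
    # strip the embedded HISTO line from the multiline message
    gsub => [ "message", "\nHISTO [^\n]*", "" ]
  }
}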

Hi Badger
Yes, the behavior you describe is correct.
My problem is that the number of FRAG lines in my log file is variable.
Sometimes we have one FRAG line; other times we have two or three.
We need a pattern that handles this situation.
Right now we are saving all the FRAG lines in a single FRAG_lines field.
filter {
  if [type] == "BCKAud_bpimagelist" {
    grok {
      match => {"message" => "IMAGE %{NOTSPACE:ClientName} %{NUMBER:Date1} %{NUMBER:Date2} %{NUMBER:Version} %{NOTSPACE:BackupID} %{NOTSPACE:ClassName} %{NUMBER:ClientType} %{NOTSPACE:ProxyClient} %{NOTSPACE:Creator} %{NOTSPACE:ScheduleLabel} %{NUMBER:ScheduleType} %{NUMBER:RetentionLevel} %{NUMBER:BackupTime} %{NUMBER:ElapsedTimeinSeconds} %{NUMBER:Expiration} %{NUMBER:Compression} %{NUMBER:Encryption} %{NUMBER:Kbytes} %{NUMBER:NumFiles} %{NUMBER:Copies} %{NUMBER:NumberFragments} %{NUMBER:FilesFileCompressed} %{NOTSPACE:FilesFile} %{NOTSPACE:SoftwareVersion} %{NOTSPACE:Name1} %{NUMBER:InputOptions} %{NUMBER:PrimaryCopy} %{NUMBER:ImageType} %{NUMBER:TrueImageRecoveryInfo} %{NUMBER:TrueImageRecoveryExpiration} %{NOTSPACE:Keywords} %{NUMBER:MPX} %{NUMBER:ExtendedSecurityInfo} %{NUMBER:IndividualFileRestorefromRaw} %{NUMBER:ImageDumpLevel} %{NUMBER:FileSystemOnly}"}
    }
    grok {
      match => {"message" => "HISTO %{NUMBER:Histodata0} %{NUMBER:Histodata1} %{NUMBER:Histodata2} %{NUMBER:Histodata3} %{NUMBER:Histodata4} %{NUMBER:Histodata5} %{NUMBER:Histodata6} %{NUMBER:Histodata7} %{NUMBER:Histodata8} %{NUMBER:Histodata9}%{GREEDYDATA:FRAG_lines}"}
    }
  }
}
Later we will see how we can process that information.
Any suggestions on how to process this type of file?
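
One idea we are considering (an untested sketch; FRAG_array is just a name we invented) is to pull every FRAG line straight out of the joined message into an array field, which sidesteps the variable line count:

filter {
  ruby {
    code => '
      # collect all FRAG lines from the joined multiline message into an array
      frags = event.get("message").to_s.split("\n").select { |l| l.start_with?("FRAG") }
      event.set("FRAG_array", frags)
    '
  }
}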
BR
