Multiline, Multi Field input question - newbie here

newmember · November 10, 2018, 9:28pm

I am putting together bits and pieces from examples to create my first custom Filebeat input.

I have tens of thousands of these files that I would like to read into ES.

cdv_nrings=8
cdv_phone=16188835888
cdv_informat=NONE
cdv_tries=1
cdv_callTime=0
cdv_newApp=arcVXML2
cdv_retryInterval=0
cdv_initialScript=http://10.30.30.17:8080/pre/vui/aOut/1176825
cdv_applicationData=15740
# 2015/09/18 17:03:16 
##Fri Sep 18 17:03:59 2015
#OutboundRetCode:603 VXML Event: error.com.arc.tel_initiatecall.tel_failure

I was thinking that I want to define some group names to match my unique field names.
My regex doesn't work in the online testers. I can't seem to describe the newline properly; maybe that's not my problem?

multiline.pattern: '=(?P<cdv_nrings>re\w+$)\n=(?P<cdv_phone>re\w+$)\n=(?P<cdv_informat>re\w+$)\n=(?P<cdv_tries>re\w+$)\n=(?P<cdv_callTime>re\w+$)\n=(?P<cdv_newApp>re\w+$)\n=(?P<cdv_retryInterval>re\w+$)\n=(?P<cdv_initialScript>re\w+$)\n=(?P<cdv_applicationData>re\w+$)\n#(?P<date>re\w+$)\n##(?P<daydate>re\w+$)\n#(?P<OutboundRetCode>re\w+$)'

Then in my filebeat.yml file I would match the group name to the field name:

- type: log
  enabled: true
  close_eof: true
  paths:
    - C:\OCS\work\0.CDF*
  fields:
    log_type: work_active
    cdv_nrings: cdv_nrings
    cdv_phone: cdv_phone
    cdv_informat: cdv_informat
    cdv_tries: cdv_tries
    cdv_callTime: cdv_callTime
    cdv_newApp: cdv_newApp
    cdv_retryInterval: cdv_retryInterval
    cdv_initialScript: cdv_initialScript
    cdv_applicationData: cdv_applicationData
    date: date
    daydate: daydate
    OutboundRetCode: OutboundRetCode

  multiline.pattern: '=(?P<cdv_nrings>re\w+$)\n=(?P<cdv_phone>re\w+$)\n=(?P<cdv_informat>re\w+$)\n=(?P<cdv_tries>re\w+$)\n=(?P<cdv_callTime>re\w+$)\n=(?P<cdv_newApp>re\w+$)\n=(?P<cdv_retryInterval>re\w+$)\n=(?P<cdv_initialScript>re\w+$)\n=(?P<cdv_applicationData>re\w+$)\n#(?P<date>re\w+$)\n##(?P<daydate>re\w+$)\n#(?P<OutboundRetCode>re\w+$)'
  multiline.negate: false
  multiline.match: before

I tested my config and that passed:

C:\filebeat-6.4.3-windows-x86_64>filebeat test config filebeat.yml
Config OK

Then I would set up a template for Kibana, but I have not gotten to that part yet.

Am I on the right track here?
How does my regex look?

Thanks

Update:
I tried this regex and got closer, but it is not perfect:

^(.+)=(.+)(\r\n\s+(.+))|^#\s(.+)(\r\n\s+(.+))|^##(.+)(\r\n\s+(.+))*|^#(.+):(.+)(\r\n\s+(.+))

ShaneP · November 11, 2018, 5:55am

Wow, really great work mate. Keep it up, you are almost on the level of perfection.

Regards,
Shane.

steffens · November 12, 2018, 8:59pm

Do you have multiple consecutive events? Also, use a ---- separator (or some other marker) to show exactly where you want to split the multiline events. Having some more samples helps in seeing and understanding the pattern.

newmember · November 13, 2018, 12:42am

Thanks

The group of 12 lines at the top of the case equals one file.
We generate 100,000+ files a day all with the same layout.

Thanks, I hope this helps.

newmember · November 13, 2018, 12:47am

I was reading up on how grok works. Not that I have a grok log statement, but I get the idea: instead of identifying all the groups in one regex, I would create one regex per field and add those statements to the yml file.

I'll try this tonight and update the case.

newmember · November 13, 2018, 8:09am

I updated the yml to use kv, deleted the registry, then ran filebeat.exe again.
It started, but nothing loaded.

Thoughts please?

- type: log
  enabled: true
  close_eof: true
  paths:
    - C:\OCS\work\0.CDF*

     filter {
       kv {
         source => "message"
         field_split => "="
         include_keys => ["cdv_nrings", "cdv_phone", "cdv_informat", "cdv_tries", "cdv_callTime", "cdv_newApp", "cdv_retryInterval", "cdv_initialScript", "cdv_applicationData", "date", "daydate", "OutboundRetCode"]
         trim => "<>[],"
         trimkey => "<>[]," 
         }
      }

steffens · November 13, 2018, 12:31pm

So one file == 1 event? In this case you don't need a complicated regex. Just try to capture everything, no matter the contents.
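
As a rough sketch of what "capture everything" could look like in filebeat.yml: this assumes every file starts with the cdv_nrings line, as in the sample above, and uses that line only as the marker for the start of an event. The path and log_type are copied from the original post; nothing here extracts fields, it only joins the 12 lines into one message.

```yaml
filebeat.inputs:
- type: log
  enabled: true
  close_eof: true
  paths:
    - C:\OCS\work\0.CDF*
  # "fields" only attaches fixed key/value pairs to every event;
  # it does not pull values out of the message.
  fields:
    log_type: work_active
  # Assumption: the first line of every file starts with "cdv_nrings=".
  # Any line that does NOT match is appended to the preceding line that does,
  # so the whole file ends up in a single event.
  multiline.pattern: '^cdv_nrings='
  multiline.negate: true
  multiline.match: after
```

Because no new start line follows the last line of a file, Filebeat sends the event once multiline.timeout (5s by default) has elapsed. Twelve lines is well below the default multiline.max_lines of 500.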

The grok and kv filters are not part of Filebeat; they belong to Logstash or Ingest Node. The filter config as used is a Logstash configuration. If you want to use it like this, publish the events to Logstash. If you want to turn your work into a Filebeat module, better start with Ingest Node (configure the pipeline in the Elasticsearch output).
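
A minimal sketch of the two output choices in filebeat.yml. The hosts and the pipeline name cdf-fields are placeholders made up for illustration, not values from this thread; the pipeline itself would have to be created in Elasticsearch separately before Filebeat can reference it.

```yaml
# Option A: send events straight to Elasticsearch and let an ingest
# pipeline split the message into fields.
output.elasticsearch:
  hosts: ["localhost:9200"]   # assumption: local single-node cluster
  pipeline: "cdf-fields"      # hypothetical pipeline name

# Option B: send events to Logstash, where the kv filter can run.
# Only one output may be enabled at a time, so this stays commented out.
#output.logstash:
#  hosts: ["localhost:5044"]  # assumption: Logstash Beats input on its usual port
```

With option A the Logstash-style filter block has to be removed from filebeat.yml entirely, since Filebeat does not understand that syntax.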

newmember · November 19, 2018, 7:41pm

Thanks

Is it best practice to have the message "cdv_informat=NONE" or should I create a field called "cdv_informat" with a sample value of "NONE" in this example?

Thanks

steffens · November 20, 2018, 10:29am

Better to create a field, so you can query/search/visualise specific fields in Kibana.

system · December 18, 2018, 10:29am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
