How to pull specific data from a single log line and index that data as fields available in every document


#1

Goal: for each line in the log, there should be a document in Elasticsearch containing the 'message' (the text after the timestamp). Each document should also contain fields for the project name, plan name, and build number. That last part is where I'm getting stuck.

Example log snippet (first two lines, labeled line 1 and line 2):
line 1: simple 01-Jan-2016 14:26:01 Build TestProj - Framework Code - Build #25 (TST-FC-25) started building on agent .NET Core 2
line 2: simple 01-Jan-2016 14:26:01 .NET-related builds, tests and publishing.

I have a grok pattern that extracts the fields I want (build name, build number, and project name) and creates them as fields in Elasticsearch:

%{NOTSPACE:log_entrytype}%{SPACE}(?<timestamp>(?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9])-\b(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)\b-(?>\d\d){1,2}\s*(?!<[0-9])%{HOUR}:%{MINUTE}(?::%{SECOND})(?![0-9]))%{SPACE}Build%{SPACE}%{DATA:BamProjName}%{SPACE}-%{SPACE}%{DATA:BamBuildName}%{SPACE}-%{SPACE}Build%{SPACE}#%{NUMBER:BamBuildNum}
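
To make it easier to see what this pattern captures, here is a rough plain-Ruby approximation of it, run against the two sample lines. It is only a sketch: the timestamp part is simplified, `BUILD_LINE` and `extract_build_fields` are made-up names, and it does not go through the grok engine. It shows that only line 1 yields the build fields, while line 2 does not match at all.

```ruby
# Approximate plain-Ruby stand-in for the grok pattern (illustration only).
# NOTSPACE ~ \S+, DATA ~ .*?, NUMBER ~ \d+; the timestamp is simplified.
BUILD_LINE = Regexp.new(
  '\A(?<log_entrytype>\S+)\s+' \
  '(?<timestamp>\d{1,2}-[A-Za-z]{3,9}-\d{2,4}\s+\d{2}:\d{2}:\d{2})\s+' \
  'Build\s+(?<BamProjName>.*?)\s+-\s+' \
  '(?<BamBuildName>.*?)\s+-\s+' \
  'Build\s+#(?<BamBuildNum>\d+)'
)

# Returns a hash of the named captures, or nil when the line is not a build header.
def extract_build_fields(line)
  m = BUILD_LINE.match(line)
  m ? m.named_captures : nil
end

line1 = 'simple 01-Jan-2016 14:26:01 Build TestProj - Framework Code - Build #25 (TST-FC-25) started building on agent .NET Core 2'
line2 = 'simple 01-Jan-2016 14:26:01 .NET-related builds, tests and publishing.'

p extract_build_fields(line1) # hash with BamProjName, BamBuildName, BamBuildNum, ...
p extract_build_fields(line2) # nil, because line 2 carries no build information
```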

Is my end goal achievable with Logstash and the grok plugin alone, or do I need Ruby here?

***NOTE: I'm using Filebeat to ship the logs, and Elastic does not recommend the multiline codec, so I'm curious what my other options are (assuming the multiline codec is even a viable option).

Going the Ruby route, assuming that is the best way, here is what I have. It creates the field in Elasticsearch, but the value comes up as "-" (I assume that means null or empty):

      code => "
        @@projName = 'lineBelowSeemsToPopulateWithEmptyStr'
        @@projName = event['BamProjName']
        event['BamProjectName'] = @@projName if @@projName.nil? || @@projName.empty?
      "

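For reference, a minimal sketch of the "carry the values forward" idea in plain Ruby, with hashes standing in for Logstash events. Two things stand out in the snippet above: the class variable is overwritten with the current event's value on every line, and the assignment only runs when that value is nil or empty, which looks inverted. The `BuildContext` class and its method names are hypothetical. Inside a real ruby filter this approach would also assume that events are processed in order by a single pipeline worker, and newer Logstash versions use `event.get` / `event.set` instead of `event['...']`.

```ruby
# Plain-Ruby simulation of "sticky" fields; hashes stand in for Logstash events.
class BuildContext
  STICKY = %w[BamProjName BamBuildName BamBuildNum].freeze

  def initialize
    @last_seen = {}
  end

  # Remember a field when the event carries it; otherwise fill it in
  # from the most recent event that did carry it.
  def apply(event)
    STICKY.each do |field|
      value = event[field]
      if value.nil? || value.to_s.empty?
        event[field] = @last_seen[field] if @last_seen.key?(field)
      else
        @last_seen[field] = value
      end
    end
    event
  end
end

ctx = BuildContext.new
events = [
  { 'message' => 'Build TestProj - Framework Code - Build #25 (TST-FC-25) started building on agent .NET Core 2',
    'BamProjName' => 'TestProj', 'BamBuildName' => 'Framework Code', 'BamBuildNum' => '25' },
  { 'message' => '.NET-related builds, tests and publishing.' }
]
enriched = events.map { |e| ctx.apply(e) }
puts enriched.last['BamProjName'] # prints "TestProj"
```

The second event has no build fields of its own and picks them up from the header line that preceded it.
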
#2

Ping! Is there anybody...out there...Is this not solvable?


#3

@newb Assuming each of your builds has start and end events: when a start event occurs, along with writing to Elasticsearch, also write to a file, creating name/value pairs like
Project Name, Sample
Build Name, Sample
Build Number, Sample

Now in the grok part of the filter you will have two sections:

If some condition
  Use the grok for start or end event
  Write to Elasticsearch and file
Else
  Use grok for normal lines
  Use translate filter to look up data from the csv file

Hope the explanation is not too confusing.


#4

Not sure I understand this.

  1. Within my Logstash filter, I need a grok under a condition (say, when this specific line is hit), and then I write this info to a file on disk, in CSV format as in your example? Using only the standard grok filter plugin? No multiline, nothing else?
  2. For all other lines, I use a separate grok filter and, under that same condition, the translate filter to pull the data from the (CSV) file on disk (so I don't necessarily need the csv plugin)?

Sounds like it could work, but I don't like all of this 'write to a file' business, and I'm confused about whether I should even be using Logstash for this. Should I? It seems like a fairly typical log parsing scenario...

