Is there a way to keep variables at a global level?!
Here is an excerpt from the log file that we are processing. As you can see, there is data about CustomerAccountNumber and ContractNumber. The trouble is that the key piece of data we need to tie all this together is only available on one of the log lines above (not shown).
So this raises a question: is there a way to preserve some key piece of data from earlier lines somewhere inside the pipeline, so we can refer to it later when we have the Customer Account data, Contract Number, etc.?
The @metadata comes close, but as far as I know that's available per Event, so that doesn't help me here.
Customer Account Number=233322 Contract Number=CFTFGXR Work Number=W336C Status=Billing Failure
Billing: ID=68183 Status=Billing Failure Billing Date=2016-01-28 Billing Amount=3494.60 Invoice#7328807 BCC=1B13
CF11130E=Incident billing dates not within work number start and end dates
CF11146E=Incident charge bill thru date is later than work number end date
Not sure how to answer your question about "unique events": they are unique, yet they are all related to an Invoice in a Request. So the hierarchy of the data is something like this:
RequestId > has many Invoices, and Invoice has customerAccount#, Contract#, Work#, Billing#, Status, BillingDate, Amount, Invoice#, BCC
These separate lines in the log are all related to a specific RequestId that is listed only once somewhere in the log above. By the time I read the customerAccountNumber, contractNumber, and workNumber I no longer have 'visibility' or knowledge of the RequestId from above...
There are multiple RequestIds in the same log, each of them having numerous Invoices under it.
Was thinking - if I come across the RequestId and have a way to preserve it across multiple Events, then I can relate all subsequent invoice details to that same RequestId until it changes. When it changes, I would then mark all details that follow to that new RequestId, and so on to the end.
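The carry-forward idea described above can be sketched in plain Ruby, outside of Logstash. This is only an illustration under assumptions: the `RequestId=...` line format and the names `REQUEST_ID_PATTERN`, `tag_with_request_id` and `SAMPLE_LINES` are made up for the example, since the real RequestId line is not shown in this thread.

```ruby
# Carry the most recent RequestId forward onto the invoice lines that follow it.
# ASSUMPTION: the "RequestId=..." line format is hypothetical; the thread never
# shows the real line that carries the RequestId.
REQUEST_ID_PATTERN = /RequestId=(\S+)/

def tag_with_request_id(lines)
  current_request_id = nil # state that survives from one line to the next
  tagged = []
  lines.each do |line|
    if (match = REQUEST_ID_PATTERN.match(line))
      current_request_id = match[1] # a new request starts: replace the old id
    elsif line.include?('Contract Number') && current_request_id
      tagged << "#{line} RequestId=#{current_request_id}"
    end
  end
  tagged
end

SAMPLE_LINES = [
  'RequestId=REQ-1',
  'Customer Account Number=233322 Contract Number=CFTFGXR Work Number=W336C Status=Billing Failure',
  'RequestId=REQ-2',
  'Customer Account Number=233322 Contract Number=CFTFGXR Work Number=W336C Status=Billing Failure'
].freeze

puts tag_with_request_id(SAMPLE_LINES)
```

Inside a pipeline the same logic only holds if events arrive in file order, so it would probably need a single pipeline worker (for example `-w 1`).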
Thx for sharing. Looking at this code with my beginner Ruby skills, it looks like you initialize a Map structure, into which you add a Year value which you extract from the Event's timestamp.
The Else clause seems to replace all instances of Jan,Feb, .... Dec with the Year that you've stored in the map previously or current Year.
I understand the concept in general. Some questions on your example:
At what point in the config did you insert this section of the custom code?
What is the scope of this @@map variable? I think this is a class variable... is this what makes it 'persistent' between different events?
Why the choice of a map Enumeration when you're only keeping track of one item?! Just curious
Here is my data and an illustration of what I need to extract and then add to each event:
Just like the illustration shows above, once I encounter the file name that I'm looking for (e.g., req_output_ALL_ALL_IC2ECFTD_580.xml) I'd like to preserve that part in some save_variable, and then add it to the end of each of the specific events that I will have carefully extracted, while skipping others. In other words, I need to "decorate" certain events with data from the save_variable.
Then, if I encounter another Response File, preserve that one (discard the old one), and then use this new one to decorate the events that follow.
I guess I have no idea if such a feat could be done with "in-line" Ruby code here in this config file, or if it requires a new Filter to be created.
# Get the main data from the logs
grok {
  match => {
    "message" => [
      "(?<tslice>%{DATE_EU} .... %{GREEDYDATA:cftsOutputFilePath}"
    ]
  }
}
if "_grokparsefailure" in [tags] {
  drop { }
}
# If you found cftsOutputFile, print out that element you found
if [cftsOutputFilePath] =~ /.+/ {
  ruby {
    init => "@respFilename = event['cftsOutputFilePath']"
    code => "puts @respFilename"
  }
  drop { }
}
Attempting to do 'baby steps' by trying to isolate the existence of that one field, and then, attempting to assign it to a simple local variable, and then simply print it out.... child's play, yet it doesn't work for me...
The runtime error is:
undefined local variable or method `event' for #<LogStash::Filters::Ruby:0x6876a023>
Init runs before any log lines are parsed. It's just for initializing state that needs to exist before you start, so there's no event yet. You want that code in the code string.
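To make the timing concrete, here is a rough stand-alone sketch in plain Ruby. `FakeRubyFilter` is a hypothetical stand-in written for this post, not the plugin's actual source. It only mimics the behaviour described above: the init string is evaluated once with no event in scope, and the code string is evaluated once per event.

```ruby
# Simplified stand-in for a filter with an `init` string and a `code` string.
# ASSUMPTION: FakeRubyFilter is hypothetical and only mimics the timing
# described above; it is not the plugin's source.
class FakeRubyFilter
  def initialize(init:, code:)
    @init = init
    @code = code
  end

  # Runs once at startup, before any event exists.
  def register
    eval(@init, binding) if @init
    @codeblock = eval("lambda { |event| #{@code} }", binding)
  end

  # Runs once per event.
  def filter(event)
    @codeblock.call(event)
    event
  end
end

# Broken: `event` is referenced in init, where no event is in scope.
broken = FakeRubyFilter.new(init: "@respFilename = event['cftsOutputFilePath']",
                            code: 'puts @respFilename')
begin
  broken.register
rescue NameError => e
  puts "init failed: #{e.class}"
end

# Working: init only sets up empty state; the per-event code fills it in.
working = FakeRubyFilter.new(init: '@respFilename = nil',
                             code: "@respFilename = event['cftsOutputFilePath']; puts @respFilename")
working.register
working.filter({ 'cftsOutputFilePath' => 'req_output_ALL_ALL_IC2ECFTD_580.xml' })
```

The first case fails in the same way as the config above, and the second shows the assignment moved into the per-event code.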
You might also be able to use: https://www.elastic.co/guide/en/logstash/current/plugins-filters-elasticsearch.html . However, the problem here is that there's a potential for races where you need to look something up that isn't in Elasticsearch quite yet. If you can do your processing in two phases, this should work. It will of course be slower than the ruby filter, since it needs to do I/O over the network.
So now that I have that variable in ruby, how do I add @@respFilename to the Event? This is the last step, I've seen examples but they showed how to add a brand new event, I need to augment an existing event with this data as an additional field. How do I do that?
Answering my own question here, for the benefit of future readers who may stumble upon this topic:
I realized through trial and error that the event variable that's exposed/available to the ruby filter can be used to retrieve individual pieces from the event. Thus, I was able to retrieve the message section of the event using event['message'] command. It was also a revelation to me that the message is a String variable that can be manipulated, appended to. As a novice Rubyist and not knowing the variable types, it took a while to arrive at this, even though this answer was simple. Retrieve the message variable, test for a condition, and then simply concatenate the message with the additional piece of data in a NVP format, thus decorating the message portion of the event. Here is the piece of code that does the trick for me:
ruby{
code => "(event['message'] = event['message'] + ' cftsFileName=' + @@respFilename) if (event['message'].include?'Contract Number' and event['message'].include?'Work Number')"
}
Is there a documentation link that would describe additional actions that you can do on that event?! How to cancel the event, how to check additional attributes, etc?
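For readers on newer versions: from Logstash 5.0 on, the event in the ruby filter is read and written with `event.get('field')` and `event.set('field', value)` rather than `event['field']`, and `event.cancel` drops the event. Below is a small sketch of those three calls in plain Ruby. `StubEvent`, `decorate` and `resp_filename` are hypothetical names used only so the snippet runs outside Logstash. It stores the file name in its own field instead of appending it to the message, which is one possible alternative to the concatenation above.

```ruby
# ASSUMPTION: StubEvent is a hypothetical stand-in exposing only the
# get / set / cancel calls, so the snippet runs outside Logstash.
class StubEvent
  def initialize(fields)
    @fields = fields
    @cancelled = false
  end

  def get(name)
    @fields[name]
  end

  def set(name, value)
    @fields[name] = value
  end

  def cancel
    @cancelled = true
  end

  def cancelled?
    @cancelled
  end
end

# Logic one might put in the ruby filter's `code` string.
# `resp_filename` stands in for the value saved from an earlier event.
def decorate(event, resp_filename)
  message = event.get('message').to_s
  if message.include?('Contract Number') && message.include?('Work Number')
    event.set('cftsFileName', resp_filename) # add a new field, leave message untouched
  else
    event.cancel # drop events we do not want to keep
  end
  event
end

invoice = StubEvent.new({ 'message' => 'Customer Account Number=233322 Contract Number=CFTFGXR Work Number=W336C Status=Billing Failure' })
decorate(invoice, 'req_output_ALL_ALL_IC2ECFTD_580.xml')
puts invoice.get('cftsFileName')

other = StubEvent.new({ 'message' => 'CF11130E=Incident billing dates not within work number start and end dates' })
decorate(other, 'req_output_ALL_ALL_IC2ECFTD_580.xml')
puts other.cancelled?
```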
Hi man, thanks for your solution, it is working for me..
Now I want to contribute a bit. I see you are trying to store the class variable into the Logstash event.
I'm a bit confused. Don't you want to assign the class variable to the event value, not the other way around? And did you change this code now because logstash has updated since you wrote this post?