Sorry about the title; my case really can't be explained in a single sentence.
Here is my situation:
- I have a large set of log files (around 4GB) that I wish to parse with Logstash for use with the Elastic Stack (Logstash, Elasticsearch, Kibana).
- In the logs there is a serial number, which I have successfully parsed out with Logstash. This number corresponds to an index in a MongoDB collection. As each log line is parsed, I want to query the collection with the parsed number and retrieve data that I want to include in the final output passed to Elasticsearch.
To make things clearer, here's a rough example. Suppose I have the raw log:
2017-11-20 14:24:14.011 123 log_number_one
Before the parsed log gets sent to Elasticsearch, I want to query my MongoDB collection with 123 and retrieve data1 and data2 to append to the document, so my end result would have fields similar to something like:
{
  timestamp: 2017-11-20 14:24:14.011,
  serial: 123,
  data1: "foo",
  data2: "bar",
  log: log_number_one
}
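For context, the parsing side already works; the grok I use boils down to something like this (simplified, but serial is the field I refer to below):

filter {
  grok {
    # matches e.g. "2017-11-20 14:24:14.011 123 log_number_one"
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{NUMBER:serial} %{GREEDYDATA:log}" }
  }
}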
An easier way to accomplish this, I assume, would be to simply preprocess the logs and run the numbers through MongoDB before feeding them to Logstash. However, since I have 4 GB worth of log files, I was hoping to achieve this in a single pass. I was wondering whether my edge case could be solved with the ruby filter plugin, where I could run some arbitrary Ruby code to do the above?
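Something along these lines is what I have in mind, although I haven't tested it; the mongo gem would have to be available to Logstash's JRuby, and my_db, my_collection, data1 and data2 are just placeholders from my example above:

filter {
  ruby {
    # runs once at startup: open a MongoDB connection
    # (host, database and collection names are placeholders)
    init => "
      require 'mongo'
      @client = Mongo::Client.new(['127.0.0.1:27017'], :database => 'my_db')
      @collection = @client[:my_collection]
    "
    # runs per event: look up the parsed serial and copy the fields over
    # (assuming the serial is stored as a number in the collection)
    code => "
      doc = @collection.find(:serial => event.get('serial').to_i).first
      unless doc.nil?
        event.set('data1', doc['data1'])
        event.set('data2', doc['data2'])
      end
    "
  }
}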
Any help / advice would be greatly appreciated!