Appending data from MongoDB into log files being processed by Logstash and parsed into Elasticsearch

Sorry about the title; my case really couldn't be summed up in a single sentence.

Here is my situation:

  1. I have a large set of log files (around 4 GB) that I wish to parse with Logstash for use with the Elastic Stack (Logstash, Elasticsearch, Kibana).
  2. The logs contain a serial number, which I have successfully parsed with Logstash. This number corresponds to a key in a MongoDB collection. As each log line is parsed, I want to query the collection with the parsed number and retrieve data to include in the final output that is passed to Elasticsearch.

To make things clearer, here's a rough example. Suppose I have the raw log:

2017-11-20 14:24:14.011 123 log_number_one

Before the parsed log gets sent to Elasticsearch, I want to query my MongoDB collection with 123 and retrieve fields data1 and data2 to append to the document, so my end result will look something like:

{
    "timestamp": "2017-11-20 14:24:14.011",
    "serial": 123,
    "data1": "foo",
    "data2": "bar",
    "log": "log_number_one"
}
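
For context, the grok filter I use to extract the serial looks roughly like the sketch below (the pattern is simplified for illustration, not my exact config):

    filter {
      grok {
        # timestamp, serial number, then the free-text remainder of the line
        match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{NUMBER:serial} %{GREEDYDATA:log}" }
      }
    }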

An easier way to accomplish this, I assume, would be to simply preprocess the logs and run the numbers through MongoDB before parsing them with Logstash. However, since I have 4 GB of log files, I was hoping to achieve this in a single pass. Would my edge case be solvable with the ruby filter plugin, where I could run some arbitrary Ruby code to do the above?

Any help / advice would be greatly appreciated!

Depending on the number of records and the total size of the data in MongoDB (assuming it is a reasonably sized data set), you may be able to extract the data into a file where each serial number is associated with a string representation of the corresponding data in JSON form. You could then use the translate filter to populate a field with the serialised JSON based on the serial number, and then use a json filter to parse this and add it to the event.
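
As a rough sketch (the file path and field names are placeholders, not a tested config), the dictionary could be a YAML file mapping each serial number to a JSON string:

    # /path/to/serial_lookup.yml -- one entry per serial number
    "123": '{"data1": "foo", "data2": "bar"}'

The filter section of the Logstash pipeline could then look something like:

    filter {
      translate {
        # look up the parsed serial and store the JSON string in a temporary field
        field           => "serial"
        destination     => "mongo_data"
        dictionary_path => "/path/to/serial_lookup.yml"
        fallback        => "{}"   # no match: parse an empty object instead of failing
      }
      json {
        # parse the JSON string into top-level fields, then drop the temporary field
        source       => "mongo_data"
        remove_field => ["mongo_data"]
      }
    }

How you generate the dictionary file is up to you; a small script that iterates over the collection with your MongoDB driver of choice and writes one line per serial number should be enough.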


Thank you so much for your help; I'll give it a try immediately!

EDIT: @Christian_Dahlqvist I haven't fully implemented it yet, but with the way things are going, I'm pretty sure it's going to work as intended. Thanks again! 🙂

EDIT 2: The MongoDB extract ended up being around 50 MB, so I had to increase the JVM heap size for Logstash to run normally. The method worked beautifully!
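
For anyone who hits the same issue: I raised the heap in Logstash's config/jvm.options file (the values below are just what worked for my 50 MB dictionary, not a general recommendation):

    # config/jvm.options -- the defaults were too small for the translate dictionary
    -Xms2g
    -Xmx2g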
