I have read some old discussions about using MongoDB as a data source, i.e. using Logstash to pull data from MongoDB and send it to Elasticsearch. Basically, this thread (MongoDB Logstash Integration [Solved]) is my starting point.
However, I have some questions:
Why is there no official logstash-input-mongodb plugin?
If not, is development of such a plugin on the roadmap?
What would be the best way to bring a "slice" of data from MongoDB to Elasticsearch?
Is there any currently maintained logstash-input-mongodb plugin?
Is there any way to run db.collection.aggregate against MongoDB and send the result to Elasticsearch?
As far as I know, there is no official Logstash MongoDB input plugin. There is, however, an official MongoDB output plugin.
Concerning your question, much of the power of open source comes from its community, so I think it's okay that the community develops its own plugins and supports and embraces them. Elastic has already developed more than 200 plugins. I did find an official but unmaintained mongodb plugin. But why develop two repositories for the same task?
Logstash is based on pulling data. Therefore, I used the JDBC input plugin, since it supports scheduling (via rufus-scheduler) for finding new rows/events. You can also use the Logstash http_poller input, which also supports scheduling, and expose a REST service in front of your MongoDB.
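As a rough illustration of the JDBC approach, here is a minimal pipeline sketch. It assumes a third-party MongoDB JDBC driver (the jar path, driver class, connection string, column names, and index name below are all hypothetical placeholders); the `schedule` option uses rufus-scheduler cron syntax, and the tracking column is what lets each run pull only a new "slice" of data:

```
input {
  jdbc {
    # Hypothetical driver and connection details - adjust to your setup
    jdbc_driver_library => "/opt/drivers/mongodb-jdbc.jar"
    jdbc_driver_class => "com.example.mongodb.jdbc.Driver"
    jdbc_connection_string => "jdbc:mongodb://localhost:27017/mydb"
    jdbc_user => "user"
    schedule => "*/5 * * * *"   # every 5 minutes (rufus-scheduler cron syntax)
    # Only fetch documents newer than the last run
    statement => "SELECT * FROM mycollection WHERE updated_at > :sql_last_value"
    use_column_value => true
    tracking_column => "updated_at"
    tracking_column_type => "timestamp"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "mongo-data"
  }
}
```

Whether this works in practice depends entirely on how SQL-complete the JDBC driver you choose is.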
Another alternative might be pushing data. Here's a collection of various mechanisms.
That probably depends on whether the JDBC input plugin and the MongoDB connector support aggregation functions. But I'm pretty sure that a REST service would give you full access to MongoDB's features.
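If you go the REST-service route, the http_poller input can poll your endpoint on a schedule. A minimal sketch, assuming a hypothetical service you run yourself that executes the aggregation and returns JSON (the URL below is a placeholder):

```
input {
  http_poller {
    urls => {
      mongo_agg => {
        method => get
        # Hypothetical endpoint wrapping db.collection.aggregate(...)
        url => "http://localhost:8080/api/orders/aggregate"
      }
    }
    schedule => { cron => "0 * * * *" }   # poll hourly
    codec => "json"
  }
}
```

The advantage here is that the aggregation pipeline lives in your service, so Logstash only sees the already-shaped result.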
Thank you for your answer. I was wondering whether there is a general lack of NoSQL input plugins. I didn't find one for Cassandra either.
My solution for now is a shell script wrapping the mongo shell plus JavaScript (mongo --eval ...); the problem is that the full process takes 6.5 hours.
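If it helps, a wrapper script like that can also be driven from Logstash itself with the exec input, so the scheduling stays in one place. A minimal sketch, assuming a hypothetical script path and that the script emits one JSON document per line:

```
input {
  exec {
    # Hypothetical wrapper around `mongo --eval ...` - adjust path and interval
    command => "/opt/scripts/export_slice.sh"
    interval => 86400   # run once a day (seconds)
    codec => "json_lines"
  }
}
```

This doesn't make the 6.5-hour export any faster, but it removes the need for a separate cron entry.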
If any other idea comes up, I will be pleased to share it.
Thank you for the suggestion. I have already tried this. That plugin is not well maintained, and it only works well for bringing in the whole collection. However, I need to bring in a slice of it.
As I mentioned before, the idea was to take a baby step before pushing data to Elasticsearch. So I wrapped a MongoDB query in a shell script which receives all the parameters I need. The mongo query saves a CSV file, which is then handled by Logstash.
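For anyone following the same path, the Logstash side of that CSV handoff can look roughly like this. This is a sketch only; the file path and column names are hypothetical and must match whatever your mongo script actually writes:

```
input {
  file {
    path => "/data/exports/mongo_slice.csv"   # hypothetical path written by the mongo script
    start_position => "beginning"
    sincedb_path => "/dev/null"   # re-read from the start each run; adjust for production
  }
}
filter {
  csv {
    separator => ","
    columns => ["id", "name", "updated_at"]   # hypothetical column names
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "mongo-csv"
  }
}
```

With `sincedb_path` pointed at a real file instead of /dev/null, the file input would instead track its position and only pick up newly appended lines.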