I have a MongoDB database with several collections, and one of them has a field that holds an XML value. This XML can be larger than 16 MB, which is MongoDB's document size limit, so I opened a Jira ticket with MongoDB support to ask how to handle it. Their recommendation was to use GridFS (MongoDB's mechanism for storing large files in binary form) and to use ES as the search engine over this field, so that I can find specific tags inside the value.
I used "mongo-connector" to replicate the MongoDB data into ES. I also downloaded the elasticsearch-mapper-attachments plugin to support the GridFS implementation. My understanding is that when ES receives this kind of data, it can index it and search on any string content.
FYI - I opened another ticket on your website, but that one is about a problem running queries with the query engine against MongoDB GridFS data in ES.
So, I would like to know: when the data is loaded from MongoDB (using GridFS), is it a good idea to run searches over large XML?