Elasticsearch mongodb river with GridFS attached to DBObject problem


(Zoran Jeremic) #1

Hi,

I'm using the Elasticsearch MongoDB river to index documents stored in
MongoDB, but I couldn't find documentation that clearly explains how to
solve the problem I have. Each document is a DBObject containing several
custom fields, with a GridFSInputFile attached as the "file" field.
I specified the river like this:
PUT _meta
{
  "type": "mongodb",
  "mongodb": {
    "db": "platform4",
    "collection": "documents2"
  },
  "index": {
    "name": "inextweb_documents4",
    "type": "documents"
  }
}

MongoDB and GridFS store the document and the file properly. The
Elasticsearch river also maps the document, so it can be searched over the
custom fields. However, I'm not able to search the text within the files.
I tried adding "gridfs":"true" and "gridfs":"fs.files" when I specified the
river, but that combination didn't work at all. I didn't use mappings at
initialization because I don't know which fields could be added at runtime.
Could you please suggest what could be the problem here?
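
For comparison, here is a rough sketch of what a GridFS-enabled river
definition might look like. It rests on assumptions that are not confirmed
in this thread: that the river expects "gridfs" as a JSON boolean inside the
"mongodb" block (not the string "true"), and that "collection" should name
the GridFS bucket ("fs" by default) rather than "fs.files". The helper name
gridfs_river_definition, the index name "files_index" and the type "files"
are hypothetical. Such a river would index the bucket itself, separately
from the river on documents2.

```python
import json


def gridfs_river_definition(db, bucket="fs", index_name="files_index"):
    """Build a river _meta body as a dict (field layout is an assumption)."""
    return {
        "type": "mongodb",
        "mongodb": {
            "db": db,
            # Assumption: the GridFS bucket name goes here, not "fs.files".
            "collection": bucket,
            # Assumption: a real boolean, which serializes to JSON true.
            "gridfs": True,
        },
        # Hypothetical index name and type, chosen only for illustration.
        "index": {"name": index_name, "type": "files"},
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent with the PUT request.
    print(json.dumps(gridfs_river_definition("platform4"), indent=2))
```

Building the body as a dict and serializing it with json.dumps guarantees
that "gridfs" is emitted as true rather than as a quoted string.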
This is the document format:

{
  "engine_id": "engineid1234",
  "external_id": "http://en.wikipedia.org/wiki/Healthcare_in_India",
  "contentType": "text/html",
  "added": 1384822341,
  "file": {
    "_id": {
      "$oid": "528ab645975c8c01cdb201fb"
    },
    "chunkSize": 262144,
    "length": 171063,
    "md5": "59cb83ecde3378749c58893567e021a3",
    "filename": "9fd7cce6-4248-41f3-98f0-dffb498a061d",
    "contentType": null,
    "uploadDate": {
      "$date": "2013-11-19T00:52:21.009Z"
    },
    "aliases": null
  },
  "fields": [
    {
      "title": {
        "boost": 2.21,
        "type": "text",
        "value": "Healthcare in India"
      }
    },
    {
      "domain": {
        "boost": 0,
        "type": "text",
        "value": "wikipedia.org"
      }
    }
  ]
}

Thanks,
Zoran


