CouchDB river: dynamic type from CouchDB document?


(Anton2) #1

We are investigating using ES with our CouchDB-based application.
So far, our tests with the CouchDB River are good, but I wonder if
there is an option to map the type of a CouchDB document (stored
as a CouchDB field) to the ES type?
This would allow us to adapt the mapping to the type of document
being indexed.

For now, I can only think of creating a (filtered) river for each type
of document, but this seems complicated and wasteful.

Thanks.


(Shay Banon) #2

This could be added to the river: an option to extract the type from the doc and use it when indexing. It does mean that when deleting, we would also need the doc from the _changes API to get the type. I'm not sure whether the _changes stream provides it for deletions.
(Replying to Anton2's post of Monday, January 10, 2011 at 5:52 PM.)


(Anton2) #3

On 10 jan, 17:26, Shay Banon shay.ba...@elasticsearch.com wrote:

This can be added to the river, an option to extract the type from the doc, and use that when indexing. It does mean that when deleting, we will need the doc as well from the _changes API to get the type, not sure if you get it or not with the _changes stream.

It looks like it doesn't, even with include_docs=true (this seems
reasonable, as the document may have been purged from CouchDB during
compaction).
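
To make the limitation concrete, here is a minimal Python sketch (not the river's actual code) of deriving an ES type from a _changes row. The field name `doc_type` and the helper `es_type_for_change` are hypothetical, and the sample rows assume a feed requested with include_docs=true in which a deleted row carries at most a stub without application fields.

```python
from typing import Optional


def es_type_for_change(row: dict, type_field: str = "doc_type") -> Optional[str]:
    """Return the ES type for a _changes row, or None if it cannot be derived."""
    doc = row.get("doc") or {}
    # Deleted rows carry no application fields, so the type field is absent.
    if row.get("deleted") or doc.get("_deleted"):
        return None
    value = doc.get(type_field)
    return value if isinstance(value, str) and value else None


# Illustrative rows; the ids, revs and fields are made up for this sketch.
update = {"seq": 12, "id": "u1", "changes": [{"rev": "2-abc"}],
          "doc": {"_id": "u1", "_rev": "2-abc", "doc_type": "user", "name": "Ann"}}
delete = {"seq": 13, "id": "u1", "changes": [{"rev": "3-def"}], "deleted": True,
          "doc": {"_id": "u1", "_rev": "3-def", "_deleted": True}}

print(es_type_for_change(update))  # type taken from the document
print(es_type_for_change(delete))  # nothing to extract for a deletion
```

For the deletion, the sketch has nothing to return, which is exactly why a type derived from the document cannot be used to address the ES document being deleted.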

In my tests, ES can find the deleted document from the id alone, by
searching on the _id field. However, this may not be applicable to
the river API?
If it isn't, is there another way of adapting the type/mapping according
to the document being indexed?

On a side note, I should say that we are very pleased and impressed by
the level of functionality and ease of use provided by Elasticsearch.
Thanks again for all this work.


(Shay Banon) #4

In the _changes stream, you do get the doc id, so it's easy to delete the matching doc in ES since the index name and type are already known.
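
As a rough illustration of why a fixed type makes deletes simple, the Python sketch below maps a _changes row to an index or delete action. The function `change_to_action` and the action dictionaries are hypothetical stand-ins for illustration only, not the river's code or an ES client API.

```python
def change_to_action(row: dict, index: str, es_type: str) -> dict:
    """Map one _changes row to an index or delete action for a fixed index/type."""
    target = {"_index": index, "_type": es_type, "_id": row["id"]}
    if row.get("deleted"):
        # Only the id is needed: index and type come from the river settings.
        return {"delete": target}
    return {"index": target, "source": row.get("doc", {})}


# Illustrative rows; ids and fields are made up for this sketch.
rows = [
    {"seq": 1, "id": "u1", "doc": {"_id": "u1", "name": "Ann"}},
    {"seq": 2, "id": "u1", "deleted": True},
]
actions = [change_to_action(r, index="my_db", es_type="my_db") for r in rows]
for action in actions:
    print(action)
```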

If we don't get the actual docs back for deletions (which makes sense from a CouchDB perspective), then you can use what you suggested before: define a filter per type and create several rivers.
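
A sketch of what the several-rivers approach could look like, generating one settings body per document type in Python. The settings layout follows the CouchDB river settings as I understand them and should be checked against the river documentation for your version. The design document `es_filters`, the filter names `only_<type>`, and the helper names are hypothetical; the filter functions would have to exist in CouchDB.

```python
import json


def river_settings(db: str, doc_type: str, host: str = "localhost", port: int = 5984) -> dict:
    """Build settings for one river that indexes a single document type."""
    return {
        "type": "couchdb",
        "couchdb": {
            "host": host,
            "port": port,
            "db": db,
            # Hypothetical CouchDB filter that passes only documents of this type.
            "filter": f"es_filters/only_{doc_type}",
        },
        # All rivers share one index; each writes to its own ES type.
        "index": {"index": db, "type": doc_type},
    }


def rivers_for_types(db: str, doc_types: list) -> dict:
    """Return a mapping of river name to settings, one river per type."""
    return {f"{db}_{t}": river_settings(db, t) for t in doc_types}


configs = rivers_for_types("my_db", ["user", "invoice"])
for name, body in configs.items():
    # Each body would be PUT to /_river/<name>/_meta on the ES node.
    print(f"PUT /_river/{name}/_meta")
    print(json.dumps(body, indent=2))
```

Because each river has a fixed type, deletions can still be applied from the id alone, at the cost of one _changes connection per type.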

-shay.banon
(Replying to Anton2's post of Monday, January 10, 2011 at 7:16 PM.)

