Indexing of couchdb views?


(sdistefano) #1

Imagine that I have a database of recipes where users can post
comments, images and perhaps even comments on the images.
I would like ES to be able to search on comments, but to return the
parent object (the recipe) as the result, since a comment on its own
has little value. In short, I would like ES to return the recipe
because one of the comments on it has matched the query.
Therefore, at the searchable index level, I would want one really long
JSON document where comments, images, etc. would be lists of dicts...
so far so good.

The problem is that, because of the size of the data, I would probably
have to split this into several documents of different types that can
be joined. Otherwise updates to CouchDB would lock, and my application
would be constantly retrieving these massive objects, loading them
into memory, appending to them and storing them back... all inside a
while loop, so that conflicts can be handled. Not ideal.
However, I could make a CouchDB view that generates this sort of long
index efficiently out of several JOINed objects. In that case it would
be great if I could get ES to keep this index-friendly view, rather
than duplicating CouchDB's format. And it would be fantastic if I
could get that index updated automatically whenever the db (and thus
the view) gets modified.

Is something like this possible? Or do you think I am looking at this
from the wrong point of view? I am rather new to nosql, so the latter
is likely...


(Shay Banon) #2

I am not sure I understand. You should be able to use the _changes API on a CouchDB view, no?
(In reply to sdistefano's post #1 of Thursday, January 27, 2011 at 10:32 PM.)
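
One way to read the suggestion above, as a hedged sketch: CouchDB releases that ship the built-in `_view` filter let the _changes feed be restricted to documents for which a view's map function emits something. The filter only decides which documents show up in the feed; it does not deliver the view's emitted rows. The host, database, design document and view names below are hypothetical.

```javascript
// Sketch: build a _changes URL that is filtered through a view's map function.
// All names passed in below are placeholders, not taken from the thread.
function changesUrl(base, db, designDoc, viewName, since) {
  const params = new URLSearchParams({
    feed: "continuous",               // keep the connection open and stream changes
    filter: "_view",                  // built-in filter: pass docs the map function emits for
    view: `${designDoc}/${viewName}`, // which view's map function to apply
    include_docs: "true",             // ship the changed document with each change
    since: String(since),             // sequence to resume from
  });
  return `${base}/${encodeURIComponent(db)}/_changes?${params}`;
}

const url = changesUrl("http://localhost:5984", "recipes", "app", "comments_by_recipe", 0);
console.log(url);
```

A consumer would read this feed line by line and re-index each changed document, remembering the last sequence it saw so it can resume with `since`.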


(Mahendra M) #3

Hi,

I guess you are looking at it from a different angle. The "recommended"
approach for doing this in CouchDB would be:

  • Have a main document (say 'type' = 'recipe')

  • Have each comment as a separate doc
    {
    'type' : 'comment',
    'recipe' : 'recipe_doc_id',
    '...' : '....',
    '...' : '....'
    }

  • Index the main doc and each of the comments in ES. The ES CouchDB river
    will take care of this automatically.

  • When a search in ES matches a comment, use the comment's 'recipe' field
    (the recipe doc id) to look up the main recipe doc.
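
The last step above can be sketched roughly as follows. This assumes each search hit carries the comment document in `_source` with the 'type' and 'recipe' fields from the layout above; the function name and the sample response are made up for illustration.

```javascript
// Sketch: turn comment hits into a de-duplicated list of recipe ids,
// keeping the order in which the hits were ranked.
function recipeIdsFromHits(searchResponse) {
  const seen = new Set();
  const ids = [];
  for (const hit of searchResponse.hits.hits) {
    const source = hit._source || {};
    // Ignore hits that are not comments or that lack a parent reference.
    if (source.type !== "comment" || !source.recipe) continue;
    if (!seen.has(source.recipe)) {
      seen.add(source.recipe);
      ids.push(source.recipe);
    }
  }
  return ids;
}

// Hypothetical response, trimmed to the fields the function reads.
const sampleResponse = {
  hits: {
    hits: [
      { _id: "c1", _source: { type: "comment", recipe: "recipe_42" } },
      { _id: "c2", _source: { type: "comment", recipe: "recipe_7" } },
      { _id: "c3", _source: { type: "comment", recipe: "recipe_42" } },
      { _id: "recipe_9", _source: { type: "recipe" } },
    ],
  },
};

const recipeIds = recipeIdsFromHits(sampleResponse);
console.log(recipeIds);
```

The resulting ids can then be fetched from CouchDB (or from ES) in one batch to display the recipes themselves.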

Delving into some CouchDB stuff here (you can get more details on how to
use CouchDB for storing and retrieving comments from various CouchDB docs).
Your view for listing a recipe and its comments can be like this.

function (doc) {
  // One row per comment, keyed by [recipe id, date] so rows sort by date per recipe.
  if (doc.type === 'comment') {
    emit([doc.recipe, doc.date], [doc.title, doc.author]);
  }
}

With this, you can load all the comments for a given recipe, ordered by
date, and then display them in a paginated manner. Alter your view to
suit your taste.
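
To make the per-recipe, by-date lookup concrete, here is a small in-memory sketch. It is a stand-in for a real view query, not CouchDB itself: `emit` is passed in as a parameter, the helper `commentsForRecipe` and the sample documents are hypothetical, and dates are assumed to be ISO strings so that plain string comparison orders them.

```javascript
// The map function from the view, with emit passed in explicitly.
// Inside CouchDB, emit is provided by the view server instead.
function mapComments(doc, emit) {
  if (doc.type === "comment") {
    emit([doc.recipe, doc.date], [doc.title, doc.author]);
  }
}

// In-memory stand-in for querying the view for one recipe.
// Against a real view this would typically be a key range such as
// startkey=["r1"] and endkey=["r1", {}], plus limit and skip.
function commentsForRecipe(docs, recipeId, limit, skip = 0) {
  const rows = [];
  for (const doc of docs) {
    mapComments(doc, (key, value) => rows.push({ id: doc._id, key, value }));
  }
  return rows
    .filter((row) => row.key[0] === recipeId)
    .sort((a, b) => (a.key[1] < b.key[1] ? -1 : a.key[1] > b.key[1] ? 1 : 0))
    .slice(skip, skip + limit);
}

// Hypothetical documents: two recipes' worth of comments plus one recipe doc.
const docs = [
  { _id: "r1", type: "recipe", title: "Pasta" },
  { _id: "c1", type: "comment", recipe: "r1", date: "2011-01-28", title: "Great", author: "ann" },
  { _id: "c2", type: "comment", recipe: "r1", date: "2011-01-27", title: "Too salty", author: "bob" },
  { _id: "c3", type: "comment", recipe: "r2", date: "2011-01-26", title: "Nice", author: "cy" },
  { _id: "c4", type: "comment", recipe: "r1", date: "2011-01-29", title: "Again!", author: "dee" },
];

const firstPage = commentsForRecipe(docs, "r1", 2);
const secondPage = commentsForRecipe(docs, "r1", 2, 2);
console.log(firstPage.map((row) => row.id), secondPage.map((row) => row.id));
```

Putting the date second in the key is what gives the per-recipe ordering; swapping in a different second key element would change the sort order accordingly.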

PS:

  • CouchDB views are not indexed in ES. Only the docs are.
  • Sorry for the top post, but it looked better.

Regards,
Mahendra

http://twitter.com/mahendra

(In reply to the post #1 by sdistefano sdistefano@gmail.com, Fri, Jan 28, 2011 at 2:02 AM.)

