Using elasticsearch to index couchdb tags

As per the subject, my documents in couchdb have a tags field which is an array.

Now I could of course do a view that outputs every single tag with the
relevant data, and I guess I would have to do some map/reduce to be
able to handle people wanting to query against multiple tags.

But considering I have the elasticsearch couchdb river running to index
couchdb, I was wondering if there is an easier way to do it?

For example, is there a way to query for "array has a particular value"?

If not, can you suggest a way to do it with elasticsearch?

I am a newbie to elasticsearch. Although I have been running it with no
problems on this project for quite a while, there has been no need for
anything but the core functionality until now.

I do have familiarity with Solr and know how I would do it there -
multiple tag fields, index those, have a special search handler for
tags.

Thanks,
Bryan Rasmussen

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Not sure I fully understand the use case.
Your concern is not about the way you are sending documents from couchDb to elasticsearch, right?

You want to somehow display a tag cloud, right?
You should have a look at: http://www.elasticsearch.org/guide/reference/api/search/facets/terms-facet/
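
For reference, here is a minimal sketch of what a terms facet request on the tags field could look like, built as a Python dict so it can be printed as JSON and sent as a search body. The field name "tags" comes from this thread; the facet name "tag_counts", the helper name and the default of 20 entries are only illustrative assumptions. Facets are the API of the Elasticsearch versions current at the time of this thread; later versions replaced them with aggregations.

```python
import json


def tag_cloud_request(field="tags", top_n=20):
    """Build a search body asking for the most frequent values of `field`.

    "tag_counts" is a hypothetical facet name chosen for this example.
    """
    return {
        "query": {"match_all": {}},  # count tags across all documents
        "size": 0,                   # only the facet counts are needed, not hits
        "facets": {
            "tag_counts": {
                "terms": {"field": field, "size": top_n}
            }
        },
    }


print(json.dumps(tag_cloud_request(), indent=2))
```

The response to such a request should list each tag together with the number of documents carrying it, which is roughly what a tag cloud needs.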

If it's not the answer you were expecting, could you update your question with some sample data and describe the result you want to get from that data?

HTH

David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr | @scrutmydocs

David Pilato david@pilato.fr Jul 04 09:17AM +0200

> If it's not the answer you were expecting, could you update your question with some
> sample data and describe the result you want to have from that data?

My main problem is that the ways to handle tags in couchdb to do stuff like

  1. return all documents that have tag1 (where a tag can be
    any descriptive text applied to a document by a user)

and

  2. return all documents that have tag1 + tag2 + tag3

and so forth

tend to be verbose and expensive.

To do that in Solr is quite easy, but since the search is actually not
that important a part of the application I didn't want to go through
the trouble of setting up and maintaining Solr, so instead I just used
the elasticsearch couchdb river.

So I would like a way to handle the problem in elasticsearch with
hopefully not a lot of work :)

tags in my documents are represented as arrays.

so solutions I might expect to be possible would be:

a way to write a query that allows me to search for a value inside of
an array item, by using a special parameter in the query.

or

have a view that does not do the common couchdb thing of outputting a
document per tag but solves the problem in some other way, and
indexing that view?

or

create a special query handler that only indexes items in the array,
so that each array[i] becomes

tag = value of array[i]

which is of course the same as the couchdb way of having a view output a
new document per actual tag in the tags array, except that the way
elasticsearch does it will not have the drawback of couchdb writing
multiple documents to disk.

Thanks,
Bryan Rasmussen

I think I found a way to do what I want, but I'd like some feedback
before I start - especially as it will help me develop some other
applications with elasticsearch as the search engine that I am working
on right now -

question

oops, sent that last one accidentally

I think I found a way to do what I want, but I'd like some feedback
before I start - especially as it will help me develop some other
applications with elasticsearch as the search engine that I am working
on right now -

given that: "...elasticsearch can easily have several couchdb rivers
(and other types of rivers) running at the same time, all pointing to
different databases and indexing them into different indices (or the
same index, you choose) using the same elasticsearch cluster."

Can I index just a view in a river and then do something like the following?

"filter" : "tagfilter",
"filter_params" : {
"tag" : doc['field_name'].values
}

So the view returns documents with tags, and probably some other data
relevant to display in the application when sorting documents by tag.
The tag param should then allow me to do a search with the param
tag=couchdb, and if I have ["couchdb","elasticsearch","other tag"] as
the tags array in my document I should be able to return that
document.

Does this sound correct?

Can I actually do "tag" : doc['field_name'].values (or some variation
that gives the same result) in my filter_params or am I going to have
to go to scripting?

Thanks,
Bryan Rasmussen

Sorry. I don't get it.

You want to create an array of values?
Search something in that array?

Right?

That's something Elasticsearch deals with. I don't see your concern.
A real example should help a lot.

As for couchdb views, you can't index them right now. See https://github.com/elasticsearch/elasticsearch-river-couchdb/pull/24

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

On Thursday, July 4, 2013 10:06:47 AM UTC+10, bryan wrote:

> As per the subject, my documents in couchdb have a tags field which is an
> array.

If your CouchDB documents have a tags array, and you're indexing the whole
document with the CouchDB river into Elasticsearch, then you don't need
to do anything else. For example, if your CouchDB document looks like this:

{
  "_id": "1",
  "tags": ["tag1", "tag2", "tag3"]
}

Then you can easily find that document with a term query like this:

{
  "query": {
    "term": { "tags": "tag2" }
  }
}

Which will return all documents with "tag2" somewhere in the tags array. No
need to create special views or anything.

Basically, Elasticsearch doesn't care if your fields are array values. You
can search them regardless.
See http://www.elasticsearch.org/guide/reference/mapping/array-type/ for
more details.
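
To cover the "tag1 + tag2 + tag3" case asked about earlier in the thread, several term queries can be combined as must clauses of a bool query, so that a document only matches if it carries every tag. Below is a minimal sketch that builds such a request body as a Python dict; the helper name all_tags_query is hypothetical and only there for illustration. Note that a term query is not analyzed, so this assumes each tag is indexed as one exact term. With the default analyzer a tag such as "other tag" would be split into separate lowercased tokens, and the tags field would probably need to be mapped as not analyzed for an exact match on the whole tag.

```python
import json


def all_tags_query(tags, field="tags"):
    """Build a bool query matching documents that contain every tag in `tags`.

    Each tag becomes one term clause; "must" requires all clauses to match.
    """
    return {
        "query": {
            "bool": {
                "must": [{"term": {field: tag}} for tag in tags]
            }
        }
    }


print(json.dumps(all_tags_query(["tag1", "tag2", "tag3"]), indent=2))
```

Swapping "must" for "should" would turn this into an "any of these tags" search instead.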

Cheers,
Dan