How to search on the number of nested term matches?


(Colin Surprenant) #1

I am trying to figure out whether, and how, it is possible to craft a specific
query using nested objects.

For example, given a simple author with nested books mapping:

{
  "author" : {
    "properties" : {
      "name" : { "type" : "string" },
      "books" : {
        "type" : "nested",
        "properties" : {
          "title" : { "type" : "string" },
          "category" : { "type" : "string", "index" : "not_analyzed" }
        }
      }
    }
  }
}

Is it possible to craft queries to answer these kinds of questions:

    1) "how many authors wrote books in N specific categories" (e.g. how many
       wrote in both "travel" and "nonfiction")
    2) "how many authors wrote books in exactly N different categories" (e.g.
       how many wrote in 2 different categories, whichever they are)
    3) "how many authors wrote books in N or more different categories"

and more generally:

    4) "what is the distribution of authors that wrote books in only 1
       category, in exactly 2 different categories, ..., N different categories"

Given a query for 2), we could express 4) programmatically by iterating over
1, 2, ..., N.

For 1) this is working for me:

{
  "query" : {
    "filtered" : {
      "query" : {
        "match_all" : {}
      },
      "filter" : {
        "and" : [
          {
            "nested" : {
              "path" : "books",
              "query" : {
                "filtered" : {
                  "filter" : {
                    "term" : {
                      "books.category" : "travel"
                    }
                  }
                }
              }
            }
          },
          {
            "nested" : {
              "path" : "books",
              "query" : {
                "filtered" : {
                  "filter" : {
                    "term" : {
                      "books.category" : "nonfiction"
                    }
                  }
                }
              }
            }
          }
        ]
      }
    }
  }
}
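
The query above repeats one "nested" clause per category, so it can be generated for any list of categories. The following is only a sketch, assuming the same filtered/and/nested syntax as in the post; the function names are made up for illustration and nothing here talks to a cluster.

```python
import json


def nested_term(category):
    """One nested clause matching authors with at least one book in `category`."""
    return {
        "nested": {
            "path": "books",
            "query": {
                "filtered": {
                    "filter": {"term": {"books.category": category}}
                }
            },
        }
    }


def authors_in_all_categories(categories):
    """Search body for question 1): authors with a book in every listed category.

    Hypothetical helper; it only reproduces the shape of the query in the post.
    """
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {"and": [nested_term(c) for c in categories]},
            }
        }
    }


body = authors_in_all_categories(["travel", "nonfiction"])
print(json.dumps(body, indent=2))
```

Called with ["travel", "nonfiction"], this produces the same structure as the hand-written query, and the total hit count of the response would be the "how many authors" answer.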

Any ideas on how we could approach 2), 3) and 4)?

Thanks,
Colin

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/3fba046c-98b4-4b46-9232-7058fd7ce6c8%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Binh Ly) #2

I can't think of a way this can be done at the moment (unless of course the
categories are finite and you can build a massive query using combinations
of them). However, you can always precompute the distinct category count
per author prior to indexing, include it as an extra field in the document,
and then filter on it using, say, a range filter.
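
A rough sketch of that precompute approach might look like the following. It assumes author documents shaped like the mapping in the first post; the field name "category_count" and both function names are hypothetical, and the new field would also need to be indexed as a numeric type.

```python
def with_category_count(author):
    """Return a copy of the author document with a distinct-category count added."""
    distinct = {book["category"] for book in author.get("books", [])}
    enriched = dict(author)
    enriched["category_count"] = len(distinct)  # hypothetical field name
    return enriched


def category_count_query(gte=None, lte=None):
    """Search body with a range filter on the precomputed count."""
    bounds = {}
    if gte is not None:
        bounds["gte"] = gte
    if lte is not None:
        bounds["lte"] = lte
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {"range": {"category_count": bounds}},
            }
        }
    }


# Made-up sample document: two travel books and one nonfiction book.
author = {
    "name": "Jane Doe",
    "books": [
        {"title": "A", "category": "travel"},
        {"title": "B", "category": "travel"},
        {"title": "C", "category": "nonfiction"},
    ],
}
doc = with_category_count(author)        # index this instead of `author`
exactly_two = category_count_query(gte=2, lte=2)
two_or_more = category_count_query(gte=2)
```

With the count stored on each document, "exactly N" becomes a range with equal bounds and "N or more" a range with only a lower bound. The trade-off is that the count has to be recomputed whenever an author's books change.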



(Colin Surprenant) #3

The number of categories is finite and relatively low. As you suggest,
querying for all combinations is an option, as is precomputing. I wanted to
see if there was a way to do it efficiently at query time.
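
For reference, the "all combinations" option could be generated rather than written by hand. This is a sketch only, assuming a known, small list of categories and the 1.x-era "or" and "bool" filters alongside the nested clause from the first post; the function names are invented for illustration.

```python
from itertools import combinations


def nested_term(category):
    """Nested clause matching authors with at least one book in `category`."""
    return {
        "nested": {
            "path": "books",
            "query": {
                "filtered": {
                    "filter": {"term": {"books.category": category}}
                }
            },
        }
    }


def exactly_k_categories_filter(all_categories, k):
    """Filter for authors whose books span exactly k of the known categories.

    One clause per k-sized combination: every chosen category must be
    present and every remaining category must be absent.
    """
    if not 1 <= k <= len(all_categories):
        raise ValueError("k must be between 1 and the number of categories")
    clauses = []
    for chosen in combinations(all_categories, k):
        others = [c for c in all_categories if c not in chosen]
        clause = {"bool": {"must": [nested_term(c) for c in chosen]}}
        if others:
            clause["bool"]["must_not"] = [nested_term(c) for c in others]
        clauses.append(clause)
    return {"or": clauses}


categories = ["travel", "nonfiction", "fiction"]  # assumed full category list
exactly_two = exactly_k_categories_filter(categories, 2)
print(len(exactly_two["or"]))
```

The number of clauses grows as "n choose k", which is why this only seems practical for a low category count. Running it for k = 1, 2, ..., N and counting hits for each would give the distribution asked about in the first post.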

Thanks,
Colin

On Wednesday, February 12, 2014 8:36:44 AM UTC-5, Binh Ly wrote:

I can't think of a way this can be done at the moment (unless of course
the categories are finite and you can build a massive query using
combinations of them). However, you can always precompute the distinct
category count per author prior to indexing and then include it as an extra
field in the document and then filter it using say a range filter.


