Query all fields in an embedded document?


(Nick Hoffman) #1

Is there a way to query all of the fields in an embedded document?

For example, if you have documents that're structured like this:
{ first_name: "Bob", interests: [ {name: "dogs", level: 5}, {topic: "dogs", degree: "strong"} ] }

Can you build a query that searches every field of the documents embedded in
"interests" for the value "dogs"?


(arien) #2

http://www.elasticsearch.org/guide/reference/api/admin-indices-aliases.html
aliases on indices using filters may help you.



(Nick Hoffman) #3

Interesting. Thanks, arien. An index alias will work if you know the name
of every field. Unfortunately, the fields that I'm dealing with are
arbitrary.


(Clinton Gormley) #4

Hi Nick


You want to look at nested fields and nested queries/filters:

http://www.elasticsearch.org/guide/reference/mapping/nested-type.html
http://www.elasticsearch.org/guide/reference/query-dsl/nested-query.html
http://www.elasticsearch.org/guide/reference/query-dsl/nested-filter.html


(Nick Hoffman) #5

Thanks for the push in the right direction, Clint. I really appreciate it.
After some reading and research, I found a basic example posted by kimchy:
https://gist.github.com/1108683

I modified that example to fit the problem that I'm trying to
solve. Unfortunately, I couldn't figure out how to search for a value
across all of the nested fields without referencing each nested field
individually. Is this possible?

Here's a gist of what I've got so far. Note that the query at the end
returns no results. I'm not sure how to work _all into it, or if that's
even possible.
https://gist.github.com/0b62966c116fd6eb6212

I tried these, but they all resulted in an invalid query:
"must": [ { "_all": { "comments._all": "this is text" } } ]
"must": [ { "_all": { "comments.*": "this is text" } } ]
"must": [ { "*": { "comments._all": "this is text" } } ]
"must": [ { "*": { "comments.*": "this is text" } } ]

Cheers,
Nick


(vineeth mohan) #6

I have gone down the same rabbit hole.
But seeing that a new document is created for each name/value pair scares
me a lot, since the number of such pairs is not limited and can grow
very large.
Is there a way I can use an array type to create name/value pairs and
search/facet on them?

Is there a better solution here?



(Clinton Gormley) #7

Hi Nick

I modified that example to fit the problem that I'm trying to solve.
Unfortunately, I couldn't figure out how to search for a value across
all of the nested fields without referencing each nested field
individually. Is this possible?

As an example, I'm going to use this doc:

{ text: "foo",
  attrib: [
    { color: "red", active: true },
    { color: "blue", active: false }
  ]
}

If 'attrib' is field-type 'object', then internally, this doc would look
something like this:

{ text: ["foo"],
  attrib.color: ["red", "blue"],
  attrib.active: [true, false]
}

If 'attrib' is field-type 'nested', then internally, this doc would be
stored as 3 separate docs, something like this:

{ text: ["foo"] }
{ color: ["red"], active: [true] }
{ color: ["blue"], active: [false] }

...plus something to tie docs 2 & 3 to the main doc.

Now, consider this query:

color == 'red' and active == false

In the first (object type) example, the values for 'attrib' are
flattened out, so the above clause will be true, even though no single
sub-object is both red and inactive.

In the second (nested type) example, each doc is considered separately,
so the above clause will be false.

So you should use the 'nested' field-type only when you need to run
queries/filters that depend on each sub-object being considered
separately.
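
To make the difference concrete, here is a small Python sketch of the two behaviours described above. It is only a toy model of the matching semantics, not how Elasticsearch works internally, and the function names are made up for illustration:

```python
# Toy model of 'object' vs 'nested' matching, using the example doc above.
# This illustrates the semantics only; it is not Elasticsearch code.

doc = {
    "text": "foo",
    "attrib": [
        {"color": "red", "active": True},
        {"color": "blue", "active": False},
    ],
}


def matches_as_object(doc, conditions):
    """'object' semantics: values are flattened into one list per field,
    so each condition may be satisfied by a different sub-object."""
    flattened = {}
    for sub in doc["attrib"]:
        for field, value in sub.items():
            flattened.setdefault(field, []).append(value)
    return all(
        value in flattened.get(field, [])
        for field, value in conditions.items()
    )


def matches_as_nested(doc, conditions):
    """'nested' semantics: every condition must hold in the same sub-object."""
    return any(
        all(field in sub and sub[field] == value
            for field, value in conditions.items())
        for sub in doc["attrib"]
    )


# color == 'red' and active == false
query = {"color": "red", "active": False}
print(matches_as_object(doc, query))  # True
print(matches_as_nested(doc, query))  # False
```

The flattened version matches because "red" and false both occur somewhere under 'attrib'; the nested version does not, because they never occur in the same sub-object.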

Further configuration:

By default, the nested properties are not visible in the parent objects.
By which I mean: a query for 'attrib.color' will only work as a nested
query/filter.

However, if you set 'include_in_parent' (i.e. the direct parent object) or
'include_in_root' (the topmost object), then the nested values will be
copied into the parent or root object, in the same way as demonstrated in
the first example.
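
As a rough sketch, a mapping with that option might be built like this. It assumes the 'tweet' type and 'comments' field from the examples in this thread; treat the exact body as an assumption and check it against the nested-type docs for your version:

```python
import json

# Hypothetical mapping for the "tweet" type used in this thread.
# "include_in_root" asks for the nested values to be copied into the root
# document as well, so "comments.*" fields can also be queried without a
# nested query (with the flattened, 'object'-style semantics).
mapping = {
    "tweet": {
        "properties": {
            "comments": {
                "type": "nested",
                "include_in_root": True,
            }
        }
    }
}

# JSON body that could be sent when creating or updating the mapping.
mapping_body = json.dumps(mapping, indent=2)
print(mapping_body)
```

Keep in mind that the copied values behave like the flattened 'object' example, so a plain query against them can match across different sub-objects.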

Only the root object has an _all field. By default, values in the
nested objects ARE included in the _all field.

So: two choices

  1. query the topmost _all field
  2. use a nested query with a bool or dismax query to query each
    field in your nested doc.

For instance, using your example docs:

QUERY THE _ALL FIELD:

curl -XGET 'http://127.0.0.1:9200/test/tweet/_search?pretty=1' -d '
{
   "query" : {
      "text" : {
         "_all" : "jack stuff"
      }
   }
}
'

NESTED QUERY OF ALL FIELDS:

curl -XGET 'http://127.0.0.1:9200/test/tweet/_search?pretty=1' -d '
{
   "query" : {
      "nested" : {
         "query" : {
            "bool" : {
               "should" : [
                  {
                     "text" : {
                        "comments.username" : "jack stuff"
                     }
                  },
                  {
                     "text" : {
                        "comments.content" : "jack stuff"
                     }
                  }
               ]
            }
         },
         "path" : "comments"
      }
   }
}
'
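
If the list of nested fields is long, a request body like the one above can be generated instead of written by hand. A minimal Python sketch, assuming you can obtain the field names yourself (for example from your own records or from the mapping); the helper name is made up, and it uses the same "text" query type as the example above, which as far as I know is called "match" in later Elasticsearch releases:

```python
import json


def nested_any_field_query(path, fields, text):
    """Build a nested query with one 'should' clause per field under `path`.

    Hypothetical helper: the field names still have to be supplied by the
    caller, since the nested docs have no _all field of their own.
    """
    clauses = [{"text": {f"{path}.{field}": text}} for field in fields]
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {"bool": {"should": clauses}},
            }
        }
    }


body = nested_any_field_query("comments", ["username", "content"], "jack stuff")
print(json.dumps(body, indent=2))
```

The printed JSON has the same structure as the nested query above and can be passed to curl with -d.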

ALSO NOTE: In your example, you are querying 'this', which is a
stopword, and would thus never return any results.

clint


(Shay Banon) #8

So, if you are after searching all fields within a specific nested object
element, you can't do that without actually specifying those fields. The
_all field can help, but it only has one "aspect" to it: you can
include / exclude fields from _all only "once". Nested docs will not help,
since there is no _all field for each nested doc.



(Nick Hoffman) #9

Wow, Clint. That was one heck of an explanation. Thank you! I really
appreciate your help.

Kimchy, your concise response was very helpful, too.

I have a much better understanding of ES and nested fields now, thanks to
you guys. If you ever need help with MongoDB, Mongoid, Ruby, or Rails, just
holler!


(my3sons) #10

Hi Guys,

So based on Nick's example, if he also wanted to query on the "name" field of the root object (i.e. where name="jane"), while still including the nested query on "comments", is that possible? If so, how should that query be constructed? I am facing a similar problem and have yet to find a solution.

Thanks!

