Nested faceted/term count query help

Hello

I am trying to develop a filtered tag/genre selection. Basically, I want to
show the user top genres and once selected, show other genres which are
also present with the selected genre.

E.g., I could have fields like:

genre1, genre2
genre2, genre3
genre4, genre1

So the top genres listed would be:

genre2, 2
genre1, 2
genre3, 1
genre4, 1

Now if the user selects genre2, I want to show the other ones that pair
with genre2:

genre1
genre3

Since genre4 is never paired with genre2, we don't show it in the list.
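
To make the desired result concrete, here is a small in-memory Java sketch of the two lists described above. It is only an illustration of the expected counts, not an Elasticsearch query; the class name GenreCooccurrence and its method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: computes the "top genres" list and the
// "genres paired with the selected genre" list from in-memory data.
class GenreCooccurrence {

    // Counts how many documents carry each genre, highest count first.
    static Map<String, Integer> topGenres(List<List<String>> docs) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (List<String> genres : docs) {
            for (String genre : genres) {
                counts.merge(genre, 1, Integer::sum);
            }
        }
        return sortByCountDesc(counts);
    }

    // Counts the genres that occur in the same document as `selected`,
    // leaving out `selected` itself.
    static Map<String, Integer> coGenres(List<List<String>> docs, String selected) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (List<String> genres : docs) {
            if (!genres.contains(selected)) {
                continue;
            }
            for (String genre : genres) {
                if (!genre.equals(selected)) {
                    counts.merge(genre, 1, Integer::sum);
                }
            }
        }
        return sortByCountDesc(counts);
    }

    // Copies the entries into a new LinkedHashMap ordered by descending count.
    private static Map<String, Integer> sortByCountDesc(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> b.getValue() - a.getValue());
        Map<String, Integer> sorted = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : entries) {
            sorted.put(entry.getKey(), entry.getValue());
        }
        return sorted;
    }

    public static void main(String[] args) {
        // The three example documents from this message.
        List<List<String>> docs = Arrays.asList(
                Arrays.asList("genre1", "genre2"),
                Arrays.asList("genre2", "genre3"),
                Arrays.asList("genre4", "genre1"));
        System.out.println(topGenres(docs));
        System.out.println(coGenres(docs, "genre2"));
    }
}
```

For the three example documents, the second call returns only genre1 and genre3, since genre4 never shares a document with genre2.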

Right now I am doing something like:

But in this case, I have to do:

1 + (top genres) queries.

Is there a way that I can get all the info in one fetch?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

But in this case, I have to do:

1 + (top genres) queries.

Is there a way that I can get all the info in one fetch?

No. Currently we don't have hierarchical facets. It is on the todo list.

clint


I found a superbly wonderful way to do this on the client (Java client)
side! First, I started with the excellent suggestion made by Lukáš Vlček at:

http://elasticsearch-users.115913.n3.nabble.com/facet-and-grouping-td4020055.html

I therefore implemented a follow-on client-side grouping hierarchy in Java
(not js as that suggestion used). Works beautifully!

Some hints:

The groupings are separated by the ~~~ string because that's what was
suggested and it works. So a quoted "~~~" Java Pattern was used to split
the resulting terms.
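
As a rough sketch of that splitting step (the class name TermSplitter is made up for this example, it is not my actual code): Pattern.quote turns the separator into a literal, so the split keeps working even if the separator is later changed to something containing regex metacharacters.

```java
import java.util.regex.Pattern;

// Hypothetical helper: splits a combined facet term such as "fl~~~beach"
// back into its individual field values.
class TermSplitter {
    static final String SEPARATOR = "~~~";

    // Compile once; Pattern.quote treats the separator as a literal string.
    private static final Pattern SPLITTER = Pattern.compile(Pattern.quote(SEPARATOR));

    static String[] split(String term) {
        return SPLITTER.split(term);
    }

    public static void main(String[] args) {
        for (String part : split("fl~~~beach")) {
            System.out.println(part);
        }
    }
}
```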

But how to handle the sorting? I keep the hierarchy in a map, but use a
Java LinkedHashMap instead of a simple HashMap. That way, every entry is
added in the same sorted order as returned by Elasticsearch. So the
top-level counts are in order, the child counts under them are in order
relative to the other children, and so on.

As an example, I loaded the US Census cities into an ES type with "state"
and "city" fields containing the state abbreviations and city names,
respectively. Here is the query:

{
  "from" : 0,
  "size" : 50000,
  "query" : {
    "match_all" : { }
  },
  "version" : true,
  "explain" : false,
  "facets" : {
    "state_city_combinations" : {
      "terms" : {
        "size" : 100,
        "script" : "doc['state'].value + \"~~~\" + doc['city'].value"
      }
    }
  }
}

When generating the response:

  1. The resulting terms are lowercased and stemmed using the English
    snowball analyzer. So Elasticsearch creates citi for City, and so on. I
    suppose I could have an additional non-stemmed copy of that field, but for
    now this works fine for me. However, I would welcome any suggestions on the
    best approach to this.

  2. I read that it's Not Good to use a Terms facet on a field that is
    analyzed into multiple words. Like my "city" field. But it seems to work
    well enough. Again, any suggestions would be welcomed.

  3. I did not find an easy way to emit the TermsFacet response as JSON, so I
    wrote my own version to match what the HTTP REST interface generates. But
    then, as a child of the terms facet object, I added the
    "combinations" : { } object. I wanted the "combinations" but left in the
    "terms" : [ ] array for testing; no big deal.

  4. My Java code made only one pass through the terms, with a tiny bit of
    recursion to add each child to its parent's LinkedHashMap (recursion depth
    was limited to the total number of fields in the hierarchy, which is
    typically very small).
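
The one-pass grouping from hint 4 could look roughly like the sketch below. This is a simplified reconstruction, not my exact code: the class name FacetHierarchy and its methods are made up for illustration, and it assumes every term has the same number of "~~~"-separated parts. Terms are added in the order Elasticsearch returned them, and LinkedHashMap keeps that order at every level.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper: groups "parent~~~child" facet terms into nested maps.
class FacetHierarchy {
    private static final Pattern SEPARATOR = Pattern.compile(Pattern.quote("~~~"));

    // Inner nodes are LinkedHashMap<String, Object>; leaves are Integer counts.
    private final Map<String, Object> root = new LinkedHashMap<>();

    // Call once per facet entry, in the order the entries were returned.
    void add(String term, int count) {
        insert(root, SEPARATOR.split(term), 0, count);
    }

    // Recursion depth equals the number of fields combined into the term.
    @SuppressWarnings("unchecked")
    private static void insert(Map<String, Object> node, String[] path, int depth, int count) {
        String key = path[depth];
        if (depth == path.length - 1) {
            node.put(key, count); // last field: store the count
            return;
        }
        // First sighting of a parent creates its map at the current position.
        Map<String, Object> child = (Map<String, Object>)
                node.computeIfAbsent(key, k -> new LinkedHashMap<String, Object>());
        insert(child, path, depth + 1, count);
    }

    Map<String, Object> asMap() {
        return root;
    }

    public static void main(String[] args) {
        FacetHierarchy hierarchy = new FacetHierarchy();
        hierarchy.add("fl~~~beach", 61);
        hierarchy.add("tx~~~citi", 36);
        hierarchy.add("mo~~~citi", 36);
        hierarchy.add("fl~~~lake", 25);
        // prints {fl={beach=61, lake=25}, tx={citi=36}, mo={citi=36}}
        System.out.println(hierarchy.asMap());
    }
}
```

Because the facet entries arrive sorted by count, each parent appears at the position of its highest-count child, and the children under it stay in descending order without any extra sorting.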

And here is the response, which looks just like the HTTP REST response but
with the added "combinations" : { } object:

{
"facets" : {
"state_city_combinations" : {
"_type" : "terms",
"total" : 25376,
"other" : 24346,
"missing" : 0,
"combinations" : {
"fl" : {
"beach" : 61,
"lake" : 25,
"citi" : 21,
"estat" : 9,
"fort" : 6,
"bay" : 6,
"east" : 5,
"creek" : 4
},
"tx" : {
"citi" : 36,
"creek" : 10,
"oak" : 9,
"la" : 9,
"hill" : 8,
"bay" : 6,
"grove" : 5,
"spring" : 4,
"park" : 4,
"falcon" : 4,
"cross" : 4,
"acr" : 4
},
"mo" : {
"citi" : 36,
"lake" : 7,
"creek" : 4
},
"il" : {
"citi" : 29,
"hill" : 9,
"grove" : 9,
"lake" : 8
},
"mn" : {
"lake" : 27,
"fall" : 6,
"citi" : 4
},
"wi" : {
"lake" : 25,
"citi" : 8,
"fall" : 7
},
"ny" : {
"east" : 25,
"fall" : 16,
"lake" : 11,
"bay" : 7,
"north" : 6,
"new" : 5,
"harbor" : 5,
"beach" : 5,
"hill" : 4
},
"ia" : {
"citi" : 22,
"center" : 4
},
"ca" : {
"citi" : 21,
"beach" : 20,
"hill" : 17,
"lake" : 8,
"east" : 8
},
"pa" : {
"citi" : 19,
"east" : 17,
"hill" : 16,
"height" : 11,
"mount" : 7,
"new" : 4,
"beaver" : 4
},
"ok" : {
"citi" : 18,
"creek" : 8,
"grove" : 4,
"acr" : 4
},
"wa" : {
"lake" : 16,
"citi" : 7,
"east" : 4,
"creek" : 4
},
"oh" : {
"citi" : 16,
"hill" : 14,
"height" : 13,
"new" : 7,
"center" : 6,
"north" : 5,
"lake" : 4,
"fall" : 4
},
"mi" : {
"lake" : 16,
"citi" : 12
},
"or" : {
"citi" : 14
},
"ks" : {
"citi" : 13
},
"ak" : {
"bay" : 13
},
"ne" : {
"citi" : 11
},
"tn" : {
"citi" : 10,
"hill" : 7
},
"nj" : {
"citi" : 10,
"beach" : 9,
"lake" : 5,
"height" : 4
},
"in" : {
"citi" : 10,
"new" : 4
},
"ct" : {
"center" : 9
},
"ky" : {
"hill" : 8
},
"ga" : {
"citi" : 8
},
"ut" : {
"citi" : 6,
"lake" : 4
},
"sd" : {
"citi" : 6,
"lake" : 4
},
"nc" : {
"citi" : 6,
"beach" : 5
},
"al" : {
"citi" : 5
},
"sc" : {
"beach" : 4
},
"pr" : {
"la" : 4
},
"md" : {
"chase" : 4
},
"de" : {
"beach" : 4
}
},
"terms" : [ {
"term" : "fl~~~beach",
"count" : 61
}, {
"term" : "tx~~~citi",
"count" : 36
}, {
"term" : "mo~~~citi",
"count" : 36
}, {
"term" : "il~~~citi",
"count" : 29
}, {
"term" : "mn~~~lake",
"count" : 27
}, {
"term" : "wi~~~lake",
"count" : 25
}, {
"term" : "ny~~~east",
"count" : 25
}, {
"term" : "fl~~~lake",
"count" : 25
}, {
"term" : "ia~~~citi",
"count" : 22
}, {
"term" : "fl~~~citi",
"count" : 21
}, {
"term" : "ca~~~citi",
"count" : 21
}, {
"term" : "ca~~~beach",
"count" : 20
}, {
"term" : "pa~~~citi",
"count" : 19
}, {
"term" : "ok~~~citi",
"count" : 18
}, {
"term" : "pa~~~east",
"count" : 17
}, {
"term" : "ca~~~hill",
"count" : 17
}, {
"term" : "wa~~~lake",
"count" : 16
}, {
"term" : "pa~~~hill",
"count" : 16
}, {
"term" : "oh~~~citi",
"count" : 16
}, {
"term" : "ny~~~fall",
"count" : 16
}, {
"term" : "mi~~~lake",
"count" : 16
}, {
"term" : "or~~~citi",
"count" : 14
}, {
"term" : "oh~~~hill",
"count" : 14
}, {
"term" : "oh~~~height",
"count" : 13
}, {
"term" : "ks~~~citi",
"count" : 13
}, {
"term" : "ak~~~bay",
"count" : 13
}, {
"term" : "mi~~~citi",
"count" : 12
}, {
"term" : "pa~~~height",
"count" : 11
}, {
"term" : "ny~~~lake",
"count" : 11
}, {
"term" : "ne~~~citi",
"count" : 11
}, {
"term" : "tx~~~creek",
"count" : 10
}, {
"term" : "tn~~~citi",
"count" : 10
}, {
"term" : "nj~~~citi",
"count" : 10
}, {
"term" : "in~~~citi",
"count" : 10
}, {
"term" : "tx~~~oak",
"count" : 9
}, {
"term" : "tx~~~la",
"count" : 9
}, {
"term" : "nj~~~beach",
"count" : 9
}, {
"term" : "il~~~hill",
"count" : 9
}, {
"term" : "il~~~grove",
"count" : 9
}, {
"term" : "fl~~~estat",
"count" : 9
}, {
"term" : "ct~~~center",
"count" : 9
}, {
"term" : "wi~~~citi",
"count" : 8
}, {
"term" : "tx~~~hill",
"count" : 8
}, {
"term" : "ok~~~creek",
"count" : 8
}, {
"term" : "ky~~~hill",
"count" : 8
}, {
"term" : "il~~~lake",
"count" : 8
}, {
"term" : "ga~~~citi",
"count" : 8
}, {
"term" : "ca~~~lake",
"count" : 8
}, {
"term" : "ca~~~east",
"count" : 8
}, {
"term" : "wi~~~fall",
"count" : 7
}, {
"term" : "wa~~~citi",
"count" : 7
}, {
"term" : "tn~~~hill",
"count" : 7
}, {
"term" : "pa~~~mount",
"count" : 7
}, {
"term" : "oh~~~new",
"count" : 7
}, {
"term" : "ny~~~bay",
"count" : 7
}, {
"term" : "mo~~~lake",
"count" : 7
}, {
"term" : "ut~~~citi",
"count" : 6
}, {
"term" : "tx~~~bay",
"count" : 6
}, {
"term" : "sd~~~citi",
"count" : 6
}, {
"term" : "oh~~~center",
"count" : 6
}, {
"term" : "ny~~~north",
"count" : 6
}, {
"term" : "nc~~~citi",
"count" : 6
}, {
"term" : "mn~~~fall",
"count" : 6
}, {
"term" : "fl~~~fort",
"count" : 6
}, {
"term" : "fl~~~bay",
"count" : 6
}, {
"term" : "tx~~~grove",
"count" : 5
}, {
"term" : "oh~~~north",
"count" : 5
}, {
"term" : "ny~~~new",
"count" : 5
}, {
"term" : "ny~~~harbor",
"count" : 5
}, {
"term" : "ny~~~beach",
"count" : 5
}, {
"term" : "nj~~~lake",
"count" : 5
}, {
"term" : "nc~~~beach",
"count" : 5
}, {
"term" : "fl~~~east",
"count" : 5
}, {
"term" : "al~~~citi",
"count" : 5
}, {
"term" : "wa~~~east",
"count" : 4
}, {
"term" : "wa~~~creek",
"count" : 4
}, {
"term" : "ut~~~lake",
"count" : 4
}, {
"term" : "tx~~~spring",
"count" : 4
}, {
"term" : "tx~~~park",
"count" : 4
}, {
"term" : "tx~~~falcon",
"count" : 4
}, {
"term" : "tx~~~cross",
"count" : 4
}, {
"term" : "tx~~~acr",
"count" : 4
}, {
"term" : "sd~~~lake",
"count" : 4
}, {
"term" : "sc~~~beach",
"count" : 4
}, {
"term" : "pr~~~la",
"count" : 4
}, {
"term" : "pa~~~new",
"count" : 4
}, {
"term" : "pa~~~beaver",
"count" : 4
}, {
"term" : "ok~~~grove",
"count" : 4
}, {
"term" : "ok~~~acr",
"count" : 4
}, {
"term" : "oh~~~lake",
"count" : 4
}, {
"term" : "oh~~~fall",
"count" : 4
}, {
"term" : "ny~~~hill",
"count" : 4
}, {
"term" : "nj~~~height",
"count" : 4
}, {
"term" : "mo~~~creek",
"count" : 4
}, {
"term" : "mn~~~citi",
"count" : 4
}, {
"term" : "md~~~chase",
"count" : 4
}, {
"term" : "in~~~new",
"count" : 4
}, {
"term" : "ia~~~center",
"count" : 4
}, {
"term" : "fl~~~creek",
"count" : 4
}, {
"term" : "de~~~beach",
"count" : 4
} ]
}
}
}


I have never tried this suggestion, but here is another solution:
