Search filter in queries and not facets


(Sebastian Gavarini) #1

I was taking a look at issue 650:
https://github.com/elasticsearch/elasticsearch/issues/650 and also a
previous related one, 499:
https://github.com/elasticsearch/elasticsearch/issues/499 but I think it's
not enough for a use case I have, and I would like to hear some
opinions. There is always the possibility that I am understanding it all
wrong :slight_smile:

As the example in 650 showed, it works fine for a single facet, "tag", but
let's expand the example to two facets at the same time by adding a new
field called "date", with values {today, yesterday}. I gisted it here:
https://gist.github.com/880814

Continuing with the example, I would like to search for:

message=something, tag=blue, date=yesterday

Faceting by tag and date.

I would have to use a term query for message=something and add an "and"
filter combining the tag=blue and date=yesterday filters. So far so good.
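
As a rough sketch of the request described above, the body might look like the following when built in Python. It assumes the search API discussed in this thread (a top-level "filter" that narrows hits only, plus "facets" with "terms"); the field names and values come from the example, and the variable name search_body is only illustrative.

```python
import json

# The query scopes both the hits and the facets.
# The top-level "filter" narrows the returned hits only (see issue 650).
search_body = {
    "query": {"term": {"message": "something"}},
    "filter": {
        "and": [
            {"term": {"tag": "blue"}},
            {"term": {"date": "yesterday"}},
        ]
    },
    "facets": {
        "tag": {"terms": {"field": "tag"}},
        "date": {"terms": {"field": "date"}},
    },
}

# Serialize to the JSON that would be sent as the search request body.
print(json.dumps(search_body, indent=2))
```

With this body alone, both facets are counted over everything matching the query, because neither facet carries a "facet_filter" yet.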

The problem is that for the "tag" facet I would have to include a
"facet_filter" with a filtered query of "message=something" and
"date=yesterday", effectively all the queries plus filters except the one
used in this facet. Similarly, for the "date" facet I would have to include a
"facet_filter" with a filtered query of "message=something" and "tag=blue".

This grows quickly with many facets at the same time. I am even scared to
ask how the parsing and execution are done.

I don't know if it's even doable, but I was thinking of something like an
integrated faceting-filter that joins both components. It would facet
restricted by every other query+filter (but not its own filter value), and
would add a filter to reduce the hit results.

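As a concrete example to see a real use case, go to Amazon:
http://www.amazon.com/gp/search/ref=sr_nr_scat_1264866011_ln?rh=n%3A1264866011%2Ck%3Amp3&keywords=mp3&ie=UTF8&qid=1300769979&scn=1264866011&h=492973c59cda11820161a14b6b76464c95673c62#/ref=sr_nr_p_n_feature_four_bro_1?rh=n%3A172282%2Cn%3A!493964%2Cn%3A172623%2Cn%3A172630%2Cn%3A1264866011%2Ck%3Amp3%2Cp_n_feature_five_browse-bin%3A2204843011%2Cp_n_feature_four_browse-bin%3A676331011&bbn=1264866011&keywords=mp3&ie=UTF8&qid=1300770039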

Yes, an ugly link, but see how the faceting is done on the left. I clicked
on "Under 1 GB" and "Line-in Recording", both facets at the same time, but
the counts for "Capacity", where "Under 1 GB" belongs, are kept independent
of that selection while still reflecting the other facet, "Line-in
Recording", and my text query "mp3".

Sorry for the long post, any ideas would be great,
Sebastian.


(Shay Banon) #2

Why would you need to pass a facet_filter that includes the message=something? You will already get only hits for that since it's your query.

If you want the facet selection to be applied to a specific facet, you add it as a facet filter. If you don't, you don't.

This allows for great flexibility: you can slice and dice the facets and decide which ones show the count of what (think also of a case with a navigation bar).
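
To make the point concrete, here is a minimal Python sketch of how a client could build such a request. The helper names term_filter and build_search_body are hypothetical, and the request shape (top-level "filter", "facets", "facet_filter", "and" filter) is assumed to be the one discussed in this thread. Each facet gets a "facet_filter" containing only the selections made in the other facets; the query is not repeated, since facets are already computed within its scope.

```python
import json


def term_filter(field, value):
    """Hypothetical helper: a single term filter clause."""
    return {"term": {field: value}}


def build_search_body(query, facet_fields, selections):
    """Build a search body where each facet ignores its own selection.

    query        -- the query dict (scopes hits and facets)
    facet_fields -- fields to facet on
    selections   -- {field: chosen value} for facets the user clicked
    """
    body = {"query": query, "facets": {}}
    if selections:
        # All selections narrow the hits via the top-level filter.
        body["filter"] = {
            "and": [term_filter(f, v) for f, v in selections.items()]
        }
    for field in facet_fields:
        # Apply every selection except the one belonging to this facet.
        others = [
            term_filter(f, v) for f, v in selections.items() if f != field
        ]
        facet = {"terms": {"field": field}}
        if others:
            facet["facet_filter"] = {"and": others}
        body["facets"][field] = facet
    return body


body = build_search_body(
    query={"term": {"message": "something"}},
    facet_fields=["tag", "date"],
    selections={"tag": "blue", "date": "yesterday"},
)
print(json.dumps(body, indent=2))
```

In this sketch each facet carries at most one clause per other selected facet, so the request size grows with the square of the number of facets rather than exponentially.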

If you still have problems, post a curl recreation of your understanding of how to do it, and we can continue from there.


(Sebastian Gavarini) #3

Hi Shay,

I can see it now: there's no need to repeat the query. I got it wrong at
first. It's actually a very flexible way to do it.

Thanks,
Sebastian.


