Two-tiered filtering of results

Hi all,

I'm working on a site which has a decent number of tagged documents
(~2m). ES indexes the tags (plus a bunch of other fields) and users
navigate the site using pages with size/from, nothing very fancy
there. ES is just storing the documents; everything else lives in
MongoDB (including the docs themselves, we just use ES for querying
and display - it's a Rails app using tire as an ES adapter).

We allow users to hide (completely remove) and obscure (they won't see
unless they click through) results by storing lists of tags on user
objects, and doing a one-by-one check (are any of $document's tags in
$user's obscured list, if so obscure, if not render normally) for
obscuring and a not terms filter for hiding documents.
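
A minimal sketch of the current approach described above, written as plain Ruby hashes rather than tire DSL calls. The field name `tags` and the helper names are assumptions made for illustration only.

```ruby
# Hypothetical helper: builds the "not terms" filter used to hide documents.
# Assumes the tags are indexed in a field called `tags`.
def hide_filter(hidden_tags)
  { not: { filter: { terms: { tags: hidden_tags } } } }
end

# Hypothetical helper: the application-side check for obscuring.
# A document is obscured if it shares at least one tag with the user's list.
def obscured?(document_tags, obscured_tags)
  !(document_tags & obscured_tags).empty?
end
```

The array intersection (`&`) stands in for the one-by-one check; it only works while the obscure rule is a flat list of tags, which is the limitation the next paragraph runs into.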

We're now looking at letting users use complex hiding/obscuring -
using booleans, mainly - by storing queries (e.g. 'tag1 OR (tag2 AND
tag3)') on the user object and searching using that as a filter
instead of our current not terms filter, but when it comes to
obscuring we're having some issues working out how to map that to ES
queries.
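
One way a stored boolean expression could be turned into an ES filter is sketched below. It assumes the expression has already been parsed into a nested array such as `[:or, 'tag1', [:and, 'tag2', 'tag3']]`; the tree format, the `to_filter` name and the `tags` field are all hypothetical, and only AND/OR are handled.

```ruby
# Hypothetical converter from a parsed boolean expression to a bool filter.
# A String is a single tag; an Array is [operator, operand, operand, ...].
def to_filter(expr, field = 'tags')
  return { term: { field => expr } } if expr.is_a?(String)

  op, *operands = expr
  clauses = operands.map { |operand| to_filter(operand, field) }
  case op
  when :and then { bool: { must: clauses } }    # every clause must match
  when :or  then { bool: { should: clauses } }  # at least one clause matches
  else raise ArgumentError, "unknown operator: #{op.inspect}"
  end
end
```

The same converter could serve both the hide expression and the obscure expression, since each is just a filter over the tags field.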

Ideally we'd get the hidden-filtered, obscured-annotated list back
from ES somehow but I don't think (having read the docs) that it's
feasible to do this.

Am I wrong in thinking I'm going to have to make a second query with
the results from the first to see which docs match the obscured
documents query, then marry the results up in the application, or is
there a neater solution?


Cheers,
James Harrison

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

How do you sort your documents?

(In reply to James Harrison's message of Wednesday, April 3, 2013 5:58:23 PM UTC-4.)

Just a simple sort on an incrementing ID field (or another static
field), never score.

Cheers,
James Harrison

(In reply to Igor Motov's message of 05/04/2013 15:04.)

If you are not using score to sort, you can use it to figure out which
records match a certain filter. All you need to do is wrap your query in a
constant_score query
(http://www.elasticsearch.org/guide/reference/query-dsl/constant-score-query/)
to ensure that your query by itself returns the same _score for all records.
Then you can wrap this constant_score query in a custom_filters_score query
(http://www.elasticsearch.org/guide/reference/query-dsl/custom-filters-score-query/),
express your show/hide logic as one of the filters with some boost
(let's use 2, for example) and set "track_scores": true, since scores are
not computed by default when sorting on a field. This way all records that
match your filter will have a score of 2.0 and all records that don't match
your filter will have a score of 1.0.
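
A rough sketch of what such a request body might look like, built as a plain Ruby hash and serialised to JSON. The `tags` and `id` field names, the helper names and the 1.5 threshold are assumptions for illustration, not taken from the thread; the query names are the 2013-era ones linked above (later Elasticsearch releases replaced custom_filters_score with function_score).

```ruby
require 'json'

# Hypothetical helper: a term filter on an assumed `tags` field.
def tag(name)
  { term: { tags: name } }
end

# Builds the search body: hidden documents are filtered out, and documents
# matching the obscure filter get their constant score boosted to 2.
def search_body(hide_filter, obscure_filter, from: 0, size: 20)
  {
    query: {
      custom_filters_score: {
        query: {
          constant_score: {
            query: {
              filtered: {
                query: { match_all: {} },
                filter: { not: { filter: hide_filter } }
              }
            }
          }
        },
        filters: [{ filter: obscure_filter, boost: 2 }],
        score_mode: 'first'
      }
    },
    sort: [{ id: 'asc' }],  # assumed incrementing ID field
    track_scores: true,     # keep _score even though we sort on a field
    from: from,
    size: size
  }
end

# Marks each hit as obscured when its score was boosted above the base 1.0.
def annotate(hits)
  hits.map { |hit| hit.merge('obscured' => hit['_score'].to_f > 1.5) }
end

if __FILE__ == $PROGRAM_NAME
  obscure = { bool: { should: [tag('tag1'), { bool: { must: [tag('tag2'), tag('tag3')] } }] } }
  puts JSON.pretty_generate(search_body(tag('hidden-tag'), obscure))
end
```

With this shape a single request returns the page already filtered for hidden documents, and the application only has to read `_score` on each hit to decide whether to obscure it.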

(In reply to James Harrison's message of Saturday, April 6, 2013 6:51:58 AM UTC-4.)

That sounds very workable as an approach - I'll give it a shot. Thanks.

Cheers,
James Harrison

(In reply to Igor Motov's message of 06/04/2013 14:20.)