Excluding multiple documents from query


(bruskie24) #1

I'm trying to use a bool query to exclude multiple documents from my
result set with the must_not clause. When I specify a single _id using
a term query, it works fine, but when I try more than one _id with
either a term or terms query, the results don't come back correctly. Is there
another way of doing this?

My original (working) query is in the form of:
{
   "query": {
      "bool": {
         "must": [
            {
               "term": {
                  "content": "data mining"
               }
            }
         ],
         "must_not": [
            {
               "term": {
                  "_id": "1234"
               }
            }
         ]
      }
   }
}

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/9e983990-3c5d-426b-b23d-7459837dfc12%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Ivan Brusic) #2

Try using an ids filter combined with a not filter instead:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-ids-filter.html

Filters are faster, cacheable (if desired) and will not influence scoring.
If you do not want the excluded ids to be added to the facet counts, use
the filter as part of a filtered query.
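
As a rough sketch of that suggestion (assuming the 1.x-era query DSL that was current for this thread, where the filtered query and the not and ids filters are available), the request might look like the following. The index name test_index, the second id 5678, and the facet name content_terms are placeholders for illustration and are not taken from the original question.

```shell
# Sketch only: assumes Elasticsearch 1.x syntax (filtered query, not/ids filters, facets).
# The id "5678" and the facet name "content_terms" are hypothetical placeholders.
QUERY='{
   "query": {
      "filtered": {
         "query": {
            "term": { "content": "data mining" }
         },
         "filter": {
            "not": {
               "filter": {
                  "ids": { "values": ["1234", "5678"] }
               }
            }
         }
      }
   },
   "facets": {
      "content_terms": {
         "terms": { "field": "content" }
      }
   }
}'

# Print the request body. To send it, uncomment the curl line
# (assumes a node on localhost:9200 and an index named test_index).
echo "$QUERY"
# curl -XPOST "http://localhost:9200/test_index/_search" -d "$QUERY"
```

Because the filter sits inside the filtered query, the excluded documents should be dropped before the facet is computed, so they would not contribute to the facet counts.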

--
Ivan

On Mon, Feb 3, 2014 at 5:38 AM, James Massey jmassey@clearedgeit.com wrote:




(Sloan Ahrens) #3

Have you tried the ids filter?

I created a simple index with some data as follows:

curl -XPUT "http://localhost:9200/test_index/"

curl -XPUT "http://localhost:9200/test_index/_bulk" -d'
{"index":{"_index":"test_index","_type":"docs","_id":1}}
{"content": "abc"}
{"index":{"_index":"test_index","_type":"docs","_id":2}}
{"content": "abc"}
{"index":{"_index":"test_index","_type":"docs","_id":3}}
{"content": "abc"}
{"index":{"_index":"test_index","_type":"docs","_id":4}}
{"content": "xyz"}
'

and then queried like this:

curl -XPOST "http://localhost:9200/test_index/_search" -d'
{
   "query": {
      "bool": {
         "must": [
            {
               "term": {
                  "content": "abc"
               }
            }
         ],
         "must_not": [
            {
               "ids": {
                  "values": ["1", "2"]
               }
            }
         ]
      }
   }
}'

and got back the result I expected:

{
   "took": 3,
   "timed_out": false,
   "_shards": {
      "total": 2,
      "successful": 2,
      "failed": 0
   },
   "hits": {
      "total": 1,
      "max_score": 0.5945348,
      "hits": [
         {
            "_index": "test_index",
            "_type": "docs",
            "_id": "3",
            "_score": 0.5945348,
            "_source": {
               "content": "abc"
            }
         }
      ]
   }
}

Here is a runnable example you can play with if you want:

http://sense.qbox.io/gist/d0ef08e3c5b94706f85ede9ea95229e315dbfa6b


(bruskie24) #4

That worked, thanks everyone!

