Having some problems with queries containing span_not. I've simplified the
query down to a test example; however, the query returns additional documents
that I don't think should be returned.
In short, I want to find the documents that contain 'foo' but not 'bar' from:
foo
foo bar
bar foo
foo foo bar
foo bar foo
bar foo foo
The below query returns two docs ('foo' and 'bar foo foo') rather than the
one I was expecting:
{
  "query": {
    "span_not": {
      "include": {
        "span_term": {
          "field1": "foo"
        }
      },
      "exclude": {
        "span_near": {
          "in_order": false,
          "clauses": [
            {
              "span_term": {
                "field1": "bar"
              }
            },
            {
              "span_term": {
                "field1": "foo"
              }
            }
          ],
          "slop": 1000
        }
      }
    }
  }
}
Why does 'bar foo foo' match the query, and given that it does, why don't
any of the others match, given that in_order is false?
Tested on Elasticsearch 1.0.1 and 1.1.1 on Ubuntu 12.04.
My guess is that Lucene is matching on the second or third "foo" since
"bar" does not appear in its span. That said, that would mean the 4th document
should have matched as well. I haven't actually run the query, but will try
later.
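
To make that guess concrete, here is a rough Python sketch of one possible explanation. It is only a toy model, not Lucene's actual code: it assumes (unverified against the Lucene source for these versions) that the unordered span_near pairs up the current occurrence of each term, emits the covering span, and then advances only whichever occurrence starts first, so not every 'bar'/'foo' combination is produced. The helper names (unordered_near_spans, span_not_matches) are made up for illustration.

```python
def unordered_near_spans(tokens, a, b, slop):
    """Toy model (an assumption, not Lucene's code) of an unordered span_near
    over two distinct terms: emit the span covering the current occurrence of
    each term, then advance only the occurrence that starts first."""
    pos_a = [i for i, tok in enumerate(tokens) if tok == a]
    pos_b = [i for i, tok in enumerate(tokens) if tok == b]
    i = j = 0
    spans = []
    while i < len(pos_a) and j < len(pos_b):
        start = min(pos_a[i], pos_b[j])
        end = max(pos_a[i], pos_b[j]) + 1  # end is exclusive
        # Number of positions in the span beyond the two matched terms.
        if (end - start) - 2 <= slop:
            spans.append((start, end))
        if pos_a[i] < pos_b[j]:
            i += 1
        else:
            j += 1
    return spans


def span_not_matches(tokens):
    """True if some 'foo' span does not overlap any excluded near span."""
    include = [(i, i + 1) for i, tok in enumerate(tokens) if tok == "foo"]
    exclude = unordered_near_spans(tokens, "bar", "foo", slop=1000)
    for inc_start, inc_end in include:
        overlaps = any(inc_start < exc_end and exc_start < inc_end
                       for exc_start, exc_end in exclude)
        if not overlaps:
            return True
    return False


docs = ["foo", "foo bar", "bar foo", "foo foo bar", "foo bar foo", "bar foo foo"]
matches = [doc for doc in docs if span_not_matches(doc.split())]
print(matches)  # ['foo', 'bar foo foo']
```

If this model is close to what happens, 'bar foo foo' only yields the excluded span covering 'bar foo', which leaves the last 'foo' uncovered. In 'foo foo bar' the 'foo' side is the one that gets advanced, so both 'foo' occurrences end up inside an excluded span and the 4th document does not match.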