I have the following search:
{
  "query": {
    "filtered": {
      "query": {
        "query_string": {
          "default_operator": "AND",
          "query": "details:foo\\-bar"
        }
      },
      "filter": {
        "term": {
          "deleted": false
        }
      }
    }
  }
}
The details field is analyzed with a pattern analyzer, configured like so:
settings: {
  index.analysis.analyzer.letterordigit.pattern: "[^\p{L}\p{N}]+",
  index.analysis.analyzer.letterordigit.type: "pattern"
}
This breaks the field into tokens separated by any non-letter or
non-numeric character.
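To make that concrete, here is a rough Python sketch of how I understand the tokenization. It is not Elasticsearch code: the tokenize helper is my own invention, Python's re module has no \p{L} or \p{N} classes so [\W_]+ stands in as an approximation of my pattern, and I am assuming the pattern analyzer lowercases tokens by default.

```python
import re

# Approximation of [^\p{L}\p{N}]+ : Python's re lacks \p{..} classes,
# so split on runs of non-word characters or underscores instead.
SEPARATORS = re.compile(r"[\W_]+")


def tokenize(text):
    """Hypothetical stand-in for the letterordigit analyzer."""
    # Lowercase (assumed analyzer default), split on separators, drop empties.
    return [tok for tok in SEPARATORS.split(text.lower()) if tok]


print(tokenize("foo-bar"))
print(tokenize("Foo xyz, BAR!"))
```

If my understanding is right, the hyphen acts purely as a separator, so "foo-bar" and "foo bar" produce the same tokens.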
But the user is searching for "foo-bar", which contains a non-alphanumeric
character. I assume, but correct me if I'm wrong, that ES applies the
same analyzer to that string. So it is broken into two tokens, ["foo",
"bar"], and then the default_operator kicks in and essentially turns the
query into "details:foo AND details:bar".
My problem is that it will match documents containing "foo xyz bar" and
"bar xyz foo" -- in the latter case, the tokens are in the reverse order
from the user's search. I'm fine with it matching the former, but it's a
stretch to convince the user that the latter is intended.
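Here is a small Python sketch of the behaviour I am seeing versus the behaviour the user expects. Everything in it is hypothetical illustration rather than anything ES provides: tokenize approximates my analyzer (with [\W_]+ standing in for the real pattern), matches_all models the AND semantics as I assume they work, and matches_in_order is just my way of describing the desired result.

```python
import re


def tokenize(text):
    # Rough stand-in for the letterordigit analyzer (approximate pattern).
    return [tok for tok in re.split(r"[\W_]+", text.lower()) if tok]


def matches_all(query, doc):
    """Assumed AND semantics: every query token occurs, in any order."""
    return set(tokenize(query)) <= set(tokenize(doc))


def matches_in_order(query, doc):
    """Desired behaviour: query tokens occur in the same order, gaps allowed."""
    remaining = iter(tokenize(doc))
    # 'tok in remaining' consumes the iterator, so each later query token
    # has to be found after the position of the previous one.
    return all(tok in remaining for tok in tokenize(query))


for doc in ("foo xyz bar", "bar xyz foo"):
    print(doc, matches_all("foo-bar", doc), matches_in_order("foo-bar", doc))
```

Both documents satisfy matches_all, but only "foo xyz bar" satisfies matches_in_order, which is the distinction I would like the search to make.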
The search string is provided by the user, so I can't really build a
complex query with different query types, hence the basic query_string
search.
Any advice or corrections to my assumptions are appreciated!