This is what I had. It can query for the words, but it matches them in any order:
{'query': {'bool': {'should': [{'bool': {'must': [{'match_phrase': {'data': 'word1'}}, {'match_phrase': {'data': 'word2'}}]}}]}}, 'from': 0, 'size': 9000}
The results returned by the above are:
bla bla word1 word2 bla
X Y word2 word1 bla bla
MNO bla bla word1 word2
ABC word1 word2 bla bla
The results I want for a given query of (word1,word2) are:
bla bla word1 word2 bla
ABC word1 word2 bla bla
MNO bla bla word1 word2
Similarly, for a query of (word2,word1), I want the results:
X Y word2 word1 bla bla
Can someone please tell me how to fix this query so that it respects the order of the words?
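To pin down the expected behaviour independently of Elasticsearch, here is a minimal plain-Python sketch applied to the four sample documents above. The helper name `contains_in_order` is made up for illustration, and it assumes simple whitespace tokenization rather than whatever analyzer the `data` field actually uses.

```python
def contains_in_order(text, words):
    """Return True if `words` occur as consecutive tokens of `text`, in that order."""
    tokens = text.split()  # assumption: whitespace tokenization, no analyzer
    n = len(words)
    return any(tokens[i:i + n] == list(words) for i in range(len(tokens) - n + 1))


docs = [
    "bla bla word1 word2 bla",
    "X Y word2 word1 bla bla",
    "MNO bla bla word1 word2",
    "ABC word1 word2 bla bla",
]

# Query (word1, word2) should keep three documents; (word2, word1) only one.
forward = [d for d in docs if contains_in_order(d, ("word1", "word2"))]
reverse = [d for d in docs if contains_in_order(d, ("word2", "word1"))]
```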
Hi,
You may want to look at intervals queries. One possible way to rewrite the query you provided is as follows:
GET my_index/_search
{
  "query": {
    "intervals": {
      "data": {
        "match": {
          "ordered": "true",
          "query": "word1 word2"
        }
      }
    }
  }
}
There are probably a few other ways to approach using word order in queries, so let me know if this doesn't solve your problem.
-William
Hi William,
Thank you for your prompt response.
I'm using the python API. If I do a search of this form:
body = {
    "query": {
        "intervals": {
            "data": {
                "match": {
                    "ordered": "true",
                    "query": "word1 word2"
                }
            }
        }
    }
}
result = es.search(index=index, body=body)
I get an error:
RequestError(400, 'parsing_exception', 'no [query] registered for [intervals]')
Apparently match_phrase is supposed to be a solution, but according to this article that is not the case:
Your understanding is correct. The former will be translated into a Lucene
phrase query, which uses the term doc positions to find matches.
Both query terms are analyzed, but the latter will simply be a bag-of-words
query, which ignores positions.
Cheers,
Ivan
On Apr 14, 2015 10:38 PM, "Dave Reed" infinity88@gmail.com wrote:
To perhaps answer my own question, I think I understand the difference.
details:"foo bar"
Would search for the tokens in the same order (implied by the docs I
ref…
body = {
    "query": {
        "multi_match": {
            "query": "word1 word2",
            "fields": ["data"],
            "type": "phrase",
            "slop": 9999
        }
    }
}
I just tried this and can confirm that it gives the same results regardless of the order of the words.
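One likely reason the order is ignored here is the large slop. In a phrase query, slop counts how many position moves are allowed, and swapping two adjacent terms costs a slop of 2, so a value like 9999 lets `word2 word1` match as well. A minimal sketch of the alternative, assuming the words must be adjacent as in the sample results: a single `match_phrase` on the whole phrase with `slop` left at 0. The helper name `ordered_phrase_body` is hypothetical; `es` and `index` are the same objects as in the earlier snippet.

```python
def ordered_phrase_body(field, words, slop=0, size=10):
    """Build a search body that matches `words` as a phrase in `field`.

    With slop=0 the terms must be adjacent and in the given order.
    """
    return {
        "query": {
            "match_phrase": {
                field: {"query": " ".join(words), "slop": slop}
            }
        },
        "from": 0,
        "size": size,
    }


body = ordered_phrase_body("data", ["word1", "word2"])
# result = es.search(index=index, body=body)  # requires a running cluster
```

Unlike the intervals query, this form does not depend on Elasticsearch 7.0, but it cannot express "in order, with other words in between"; that case still needs intervals (or span queries).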
The intervals query was introduced in Elasticsearch 7.0, I believe. Which version are you using?
If it's a version issue, I may well be using an older one. Is there any way I could check?
Running curl localhost:9200 should give you the version number.
If you have the URL of your Elasticsearch instance, you can curl it or enter it in a browser to see version information. You should see something like this:
{
  "name" : "...",
  "cluster_name" : "...",
  "cluster_uuid" : "...",
  "version" : {
    "number" : "7.2.0",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "508c38a",
    "build_date" : "2019-06-20T15:54:18.811730Z",
    "build_snapshot" : false,
    "lucene_version" : "8.0.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
And I apologize for not mentioning that Intervals queries are a somewhat new feature.
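Since you are on the Python client, the same check can be done from code. This is a rough sketch: the helper name `supports_intervals` is made up, and it only looks at the major version number in the structure shown above. With a live connection, the `info` dict would come from `es.info()`.

```python
def supports_intervals(info):
    """Return True if the cluster info reports Elasticsearch 7.0 or later."""
    number = info["version"]["number"]   # e.g. "7.2.0"
    major = int(number.split(".")[0])    # major version is the first component
    return major >= 7


# With the Python client this would be: info = es.info()
sample = {"version": {"number": "7.2.0"}}
print(supports_intervals(sample))  # prints True
```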