I need to return documents that match at least N words in the same sentence.
I split my documents into sentences and index each sentence as a separate value of the same field, like so:
PUT /test_index/_doc/id1
{
  "texts": [
    "Your first step is the subject line.",
    "You will have just seconds to gain the full attention of your reader."
  ]
}
I leave position_increment_gap at its default of 100.
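To make the effect of the gap concrete, here is a rough Python sketch of how I understand token positions to be assigned across the values of the field. The helper `token_positions` is hypothetical, and the regex tokenizer is only a stand-in for the standard analyzer; the exact offsets are my assumption about how the gap is applied, not something I verified against the index.

```python
import re

def token_positions(values, gap=100):
    """Approximate token positions for a multi-valued text field.

    Assumption: the first token of each later value sits `gap` extra
    positions after the last token of the previous value.
    """
    positions = []
    pos = -1
    for i, value in enumerate(values):
        if i > 0:
            pos += gap  # jump introduced between two values of the field
        for token in re.findall(r"\w+", value.lower()):
            pos += 1
            positions.append((token, pos))
    return positions

texts = [
    "Your first step is the subject line.",
    "You will have just seconds to gain the full attention of your reader.",
]
pos = dict(token_positions(texts))  # for repeated tokens the last position wins

# Same sentence: small distance. Different sentences: more than the gap.
print(pos["reader"] - pos["attention"])
print(pos["reader"] - pos["subject"])
```

Under this model, two terms from the same sentence are only a few positions apart, while terms from different sentences are always more than 100 positions apart, which is the property I would like to exploit.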
Let's say I need to match a minimum of 2 words.
I need the document to be returned if I search for the terms ("bla", "attention", "reader"), but not if I search for ("bla", "subject", "reader"). "bla" is not in the document; "attention" and "reader" are in the same sentence; "subject" and "reader" are in different sentences.
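In other words, the rule I am after looks like the following Python sketch. This is only a reference for the expected behaviour, not Elasticsearch code; `matches` and `tokenize` are hypothetical helpers, and the tokenizer is a crude approximation of the analyzer.

```python
import re
from typing import Iterable, List, Set

def tokenize(sentence: str) -> Set[str]:
    # Crude stand-in for the analyzer: lowercase and split on non-word characters.
    return set(re.findall(r"\w+", sentence.lower()))

def matches(texts: List[str], terms: Iterable[str], min_match: int) -> bool:
    """True if any single sentence contains at least `min_match` distinct terms."""
    wanted = {t.lower() for t in terms}
    return any(len(wanted & tokenize(s)) >= min_match for s in texts)

texts = [
    "Your first step is the subject line.",
    "You will have just seconds to gain the full attention of your reader.",
]

print(matches(texts, ["bla", "attention", "reader"], 2))  # True: both in sentence 2
print(matches(texts, ["bla", "subject", "reader"], 2))    # False: split across sentences
```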
The approach with a bool query using should clauses and minimum_should_match does not work, because it counts matches across the whole field rather than per sentence. This query returns the document when it shouldn't:
GET /test_index/_search
{
  "query": {
    "bool": {
      "should": [
        {"term": {"texts": "subject"}},
        {"term": {"texts": "reader"}}
      ],
      "minimum_should_match": 2
    }
  }
}
So I need a way to mix proximity and minimum should match.
Is there a way to achieve that?