I have a use case where I'd like to use the mini-language of the query string query, or at least its boolean operator parser, at the sentence level as well as the document level. I can define sentences easily enough by using a large position_increment_gap between them at index time, which constrains phrase queries with a slop (smaller than the gap) to stay within a single sentence. I can then search for co-occurrence using a span query.
But I'd like to be able to use the parser logic for boolean operators that Lucene provides, so that we can run richer queries such as:
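To make the setup concrete, here is a minimal sketch of what I mean, assuming Elasticsearch-style mappings and query DSL written as plain Python dicts. The field name `sentences`, the gap of 1000, and the helper `same_sentence` are made up for illustration, and the sketch assumes each sentence is indexed as a separate value of the field and is shorter than the gap.

```python
# Hypothetical values: the field name and gap size are illustrative only.
SENTENCE_GAP = 1000

# Each sentence is indexed as one element of an array field, so the
# position_increment_gap separates the token positions of adjacent sentences.
mapping = {
    "mappings": {
        "properties": {
            "sentences": {
                "type": "text",
                "position_increment_gap": SENTENCE_GAP,
            }
        }
    }
}

doc = {"sentences": ["The man walked home.", "A dog barked."]}


def same_sentence(field, terms, gap=SENTENCE_GAP):
    """Build a span_near body requiring all terms to fall in one sentence.

    Terms must already be in their analyzed form (e.g. lowercased), because
    span_term clauses are not analyzed. The slop stays below the gap so a
    match cannot straddle a sentence boundary.
    """
    return {
        "span_near": {
            "clauses": [{"span_term": {field: term}} for term in terms],
            "slop": gap - 1,
            "in_order": False,
        }
    }


query = {"query": same_sentence("sentences", ["man", "home"])}
```

With this, `man` and `home` should only co-occur as a hit when they sit in the same array element, i.e. the same sentence.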
man OR person
X AND ( Y OR Z)
and have that only hit if the matches are all within the same sentence.
I was hoping to avoid having to write a parser that takes the Lucene mini-language syntax and converts it to span queries.
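For reference, the kind of converter I'd rather not have to maintain might look roughly like the sketch below. It is not Lucene's parser: it only handles bare terms, AND, OR and parentheses, it assumes AND binds tighter than OR (which may not match how the query string parser actually treats precedence), and it lowercases terms on the assumption that the field uses a lowercasing analyzer. The field name `sentences`, the gap value and the function names are all hypothetical.

```python
import re

SENTENCE_GAP = 1000  # assumed; must match the position_increment_gap at index time

TOKEN_RE = re.compile(r"\(|\)|[^\s()]+")


def parse(text):
    """Parse 'X AND (Y OR Z)' into a nested (kind, value) tuple tree."""
    tokens = TOKEN_RE.findall(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def parse_or():
        nodes = [parse_and()]
        while peek() == "OR":
            advance()
            nodes.append(parse_and())
        return nodes[0] if len(nodes) == 1 else ("OR", nodes)

    def parse_and():
        nodes = [parse_atom()]
        while peek() == "AND":
            advance()
            nodes.append(parse_atom())
        return nodes[0] if len(nodes) == 1 else ("AND", nodes)

    def parse_atom():
        tok = peek()
        if tok is None or tok in (")", "AND", "OR"):
            raise ValueError(f"unexpected token: {tok!r}")
        advance()
        if tok == "(":
            node = parse_or()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            advance()
            return node
        return ("TERM", tok.lower())

    tree = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token: {tokens[pos]!r}")
    return tree


def to_span(node, field, gap=SENTENCE_GAP):
    """Translate the tree into span queries: AND -> span_near, OR -> span_or."""
    kind, value = node
    if kind == "TERM":
        return {"span_term": {field: value}}
    clauses = [to_span(child, field, gap) for child in value]
    if kind == "OR":
        return {"span_or": {"clauses": clauses}}
    # AND: every clause must match within one sentence, in any order.
    return {"span_near": {"clauses": clauses, "slop": gap - 1, "in_order": False}}


span_query = to_span(parse("X AND ( Y OR Z)"), "sentences")
```

Here `X AND ( Y OR Z)` becomes a `span_near` over a `span_term` for `x` and a `span_or` of `y` and `z`. NOT, phrases, wildcards and field prefixes are all left out, which is exactly the part of the mini-language I'd prefer not to reimplement.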
Or perhaps there's a better way to achieve this with existing techniques?
I wondered whether there is a clause I'm missing that would constrain all matched tokens to be within a certain distance of each other. I don't mean slop within a phrase query, but a global slop over all tokens matched. That way all matches could be forced to fall within a single sentence.
Grateful for any suggestions!