Is there a way to do a span query near a particular token index?
For example, is it possible to find all documents that mention "foo" within ten tokens of the start of the document? Or more generally, is it possible to find all documents that contain "foo" within X tokens of the Yth token in the document?
There is no span query that does this directly, since reasoning about the relative positions of tokens is the more common use case. However, the information needed to find tokens near a given absolute position is available in the Lucene index, so it would be possible to implement such a span query.
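To make the idea concrete, here is a minimal sketch of the check such a query would have to perform. It is plain Python and does not use the Lucene API: the names `build_positions`, `term_near_position` and `matching_docs` are hypothetical, tokenization is assumed to be a simple whitespace split, and positions are assumed to be 0-based.

```python
from typing import Dict, List


def build_positions(text: str) -> Dict[str, List[int]]:
    """Map each token to the list of absolute positions where it occurs.

    Assumption: whitespace tokenization, positions counted from 0.
    """
    positions: Dict[str, List[int]] = {}
    for index, token in enumerate(text.split()):
        positions.setdefault(token, []).append(index)
    return positions


def term_near_position(positions: Dict[str, List[int]],
                       term: str, anchor: int, slop: int) -> bool:
    """True if `term` occurs within `slop` tokens of absolute position `anchor`."""
    return any(abs(p - anchor) <= slop for p in positions.get(term, []))


def matching_docs(docs: Dict[str, str], term: str,
                  anchor: int, slop: int) -> List[str]:
    """Return the sorted ids of documents where `term` is near `anchor`."""
    return sorted(
        doc_id
        for doc_id, text in docs.items()
        if term_near_position(build_positions(text), term, anchor, slop)
    )


# Hypothetical sample data: "foo" is at position 0 in doc "a"
# and at position 12 in doc "b".
docs = {
    "a": "foo bar baz",
    "b": " ".join(["pad"] * 12 + ["foo"]),
}

near_start = matching_docs(docs, "foo", anchor=0, slop=10)
near_fifteen = matching_docs(docs, "foo", anchor=15, slop=3)
```

With this sample data, the first call covers the "within ten tokens of the start" case (anchor 0, X = 10) and the second covers the general case (Y = 15, X = 3). A real implementation would read the positions from the index postings instead of re-tokenizing the text.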