Is there any way to apply the lowercase token filter and preserve the original tokens too?
Our goal is to combine case-insensitive search (a Span Term on lowercased text) with case-sensitive search (a Span Term on the original text with its upper-case characters, which is useful for abbreviations, company names, etc.) in the same query, for example a Span Near Query.
The lowercase filter does not allow that. Besides, such an approach would raise issues with term statistics. There is currently no other way to do what you want, but it may become possible in the future: you would index two fields (with multi-fields) and then use Lucene's FieldMaskingSpanQuery to build a SpanNearQuery across the two fields. For that to work, we would first need to expose FieldMaskingSpanQuery in Elasticsearch.
Multi-fields cannot be used in our use case because it is not possible to combine different multi-fields in one Span Query (that is what FieldMaskingSpanQuery would solve, as Adrien wrote above).
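To make the two-field idea discussed above concrete, here is a minimal, library-free Python sketch. It is not Lucene or Elasticsearch code and does not use FieldMaskingSpanQuery itself; it only imitates the principle that two fields built from the same text share token positions, so a near-match can mix a case-sensitive clause with a case-insensitive one. The whitespace tokenizer, the field names `exact` and `lower`, and the helpers `index_fields` and `span_near` are all assumptions made for illustration.

```python
from itertools import product


def index_fields(text):
    """Build two toy 'fields' from one text: term -> list of token positions.

    'exact' keeps the original case, 'lower' is lowercased. Both fields share
    the same positions because they come from the same whitespace tokenizer.
    """
    fields = {"exact": {}, "lower": {}}
    for pos, token in enumerate(text.split()):
        fields["exact"].setdefault(token, []).append(pos)
        fields["lower"].setdefault(token.lower(), []).append(pos)
    return fields


def span_near(fields, clauses, slop=0):
    """Return True if the clauses match in order with at most `slop` gaps.

    Each clause is a (field_name, term) pair, so one clause can be
    case-sensitive ('exact') and the next case-insensitive ('lower').
    """
    if not clauses:
        return False
    position_lists = [fields[name].get(term, []) for name, term in clauses]
    # Try every combination of one position per clause (fine for a toy example).
    for combo in product(*position_lists):
        in_order = all(a < b for a, b in zip(combo, combo[1:]))
        gaps = combo[-1] - combo[0] - (len(combo) - 1)
        if in_order and gaps <= slop:
            return True
    return False


doc = index_fields("The US Army and us civilians")
# Case-sensitive "US" directly followed by case-insensitive "army".
print(span_near(doc, [("exact", "US"), ("lower", "army")]))
# Case-sensitive "us" only occurs after "Army", so this does not match.
print(span_near(doc, [("exact", "us"), ("lower", "army")]))
```

In this sketch the first query matches and the second does not, which is the mixed behaviour asked for in the original question. In real Lucene the two fields would be separate indexed fields, and a masking query would be needed to let one span query treat them as a single field.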