You can also try creating a custom analyser that handles this the way you want, e.g. first use a pattern replace token filter to replace any full stops (maybe also commas and other special characters?) followed by whitespace with just a white space and then apply a whitespace tokenizer. That should solve the issue you are seeing now but may require some tweaking to handle other scenarios/edge cases.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.