I'm not familiar with tokenizer, but Table 3. Word_Break Property Values in the latter page says, U+002E ( . ) FULL STOP is treated as a word boundary for MidNumLet but not for MidLetter. I have no idea about the reason.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.