I suspect you are hitting this issue. The file input reads lines, and the data is split into lines before the codec (and its character encoding) is applied, so it cannot properly process 16-bit character encodings such as UTF-16.
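To see why the order matters, here is a minimal Python sketch. It is not Logstash's actual code, and the function names `split_then_decode` and `decode_then_split` are made up for the illustration. It assumes the data is UTF-16LE, where a newline is the two bytes `0A 00`, so splitting on the single byte `0A` leaves a stray `00` at the start of the next line and shifts every following 16-bit code unit by one byte.

```python
def split_then_decode(raw, encoding):
    """Split on the single newline byte first, then decode each chunk.

    This mimics a reader that finds line boundaries on raw bytes
    before any character encoding is applied.
    """
    return [chunk.decode(encoding, errors="replace") for chunk in raw.split(b"\n")]


def decode_then_split(raw, encoding):
    """Decode the whole buffer first, then split on the newline character."""
    return raw.decode(encoding).split("\n")


# Sample data (an assumption for the demo): two lines encoded as UTF-16LE.
raw = "héllo\nwörld".encode("utf-16-le")

broken = split_then_decode(raw, "utf-16-le")
correct = decode_then_split(raw, "utf-16-le")

print(correct)  # the two original lines
print(broken)   # the second line is misaligned and decodes to garbage
```

The first line survives here only because none of its bytes happen to be `0A`. A character whose 16-bit code unit contains that byte (for example U+010A) would be cut in half as well.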