I am new to the Elasticsearch + Apache Spark combination. I have a question: how can I set the datatypes in Elasticsearch for a DataFrame I want to save?
For example, when I save a DataFrame, some columns get saved as the 'text' type, but I would like them saved as the 'keyword' type. Also, can I update the mappings before the DataFrame is written to the index, or can that only be done after a write operation?
@Muthu_Jayakumar ES-Hadoop defers to Elasticsearch to automatically map fields. This means that string data will be automatically mapped to the text type, dates may also end up as text, and so on.
If these mappings are not what you expect, you must create the index and its mappings yourself before sending documents with ES-Hadoop/Spark. If you are working with multiple indices, another simple way to handle this is to use index templates in Elasticsearch.
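A minimal sketch of the "create the index first" approach. The index name `people` and the field names here are hypothetical, and the client/Spark calls are shown commented out since they need a running cluster; the helper only builds the index-creation body with explicit `keyword` mappings:

```python
def build_mapping(keyword_fields, text_fields=()):
    """Build an index-creation body mapping the given fields explicitly,
    so Elasticsearch's dynamic mapping never gets a chance to guess."""
    props = {f: {"type": "keyword"} for f in keyword_fields}
    props.update({f: {"type": "text"} for f in text_fields})
    return {"mappings": {"properties": props}}

mapping = build_mapping(keyword_fields=["country_code"], text_fields=["bio"])
# mapping == {"mappings": {"properties": {"country_code": {"type": "keyword"},
#                                         "bio": {"type": "text"}}}}

# Create the index before any Spark write (requires a cluster):
# from elasticsearch import Elasticsearch
# es = Elasticsearch("http://localhost:9200")
# es.indices.create(index="people", body=mapping)

# Then write the DataFrame with ES-Hadoop. Because "people" already exists,
# its explicit mappings are used instead of automatic mapping:
# df.write.format("org.elasticsearch.spark.sql") \
#     .option("es.resource", "people") \
#     .mode("append") \
#     .save()
```

For the multi-index case, the same `mappings` body can go into an index template (via the `_index_template` API) so every index matching a name pattern picks it up automatically.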