I am using AWS EMR PySpark to write to AWS Elasticsearch Service (cluster version 6.3) using elasticsearch-hadoop-7.5.2.jar from https://www.elastic.co/downloads/hadoop.
Currently I have to create the index with mapping disabled first, and then write to that index. It would be awesome if the Spark write could automatically create the index with mapping disabled.
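For reference, this is roughly my current two-step flow. It is just a sketch: the endpoint, index name, and DataFrame are placeholders, and I am assuming "mapping disabled" means setting "enabled": false on the 6.x type mapping.

```python
import requests

# Placeholder endpoint and index name.
ES_ENDPOINT = "https://my-domain.us-east-1.es.amazonaws.com"
INDEX = "my-index"

# Step 1: create the index with the mapping disabled ("enabled": false),
# so Elasticsearch stores _source but does not parse or index the fields.
requests.put(
    "{}/{}".format(ES_ENDPOINT, INDEX),
    json={"mappings": {"_doc": {"enabled": False}}},
)

# Placeholder DataFrame; `spark` is the active session on the EMR cluster.
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])

# Step 2: write the DataFrame to the pre-created index with ES-Hadoop.
(df.write
    .format("org.elasticsearch.spark.sql")
    .option("es.nodes", ES_ENDPOINT)
    .option("es.port", "443")
    .option("es.nodes.wan.only", "true")  # typical for an AWS-hosted cluster
    .mode("append")
    .save("{}/_doc".format(INDEX)))
```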
You could try defining your disabled mapping in an index template, which would then be applied to any indices created by ES-Hadoop. I don't believe that should cause any problems when writing the data from Hadoop. If the connector rejects that configuration on write, feel free to share the errors here and I can take a look.
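As a rough sketch, a 6.x index template with the mapping disabled could look like this (the endpoint, template name, and index pattern are placeholders):

```python
import requests

# Register a 6.x index template. Any index whose name matches
# index_patterns, including ones auto-created by ES-Hadoop on write
# (es.index.auto.create defaults to true), picks up this mapping.
requests.put(
    "https://my-domain.us-east-1.es.amazonaws.com/_template/disabled-mapping",
    json={
        "index_patterns": ["my-index*"],
        "mappings": {"_doc": {"enabled": False}},
    },
)
```

With the template in place, the manual index creation step should no longer be needed: the Spark write alone creates the index, and the template supplies the disabled mapping.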
One word of caution: you might run into issues if you plan to read the data back with ES-Hadoop from an index without mappings. ES-Hadoop uses the index's mappings to decide how to deserialize fields, so you may get different types for the data than when it was originally written. Additionally, you would need to set es.read.unmapped.fields.ignore to false, or the connector will throw out fields that are not mapped in ES.
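For example, a read back of such an index might look like the following (the endpoint and index name are placeholders; the key line is the es.read.unmapped.fields.ignore setting):

```python
# Read back from the unmapped index. Without the es.read.unmapped.fields.ignore
# override, the connector discards fields that have no mapping in Elasticsearch.
df = (spark.read
    .format("org.elasticsearch.spark.sql")
    .option("es.nodes", "https://my-domain.us-east-1.es.amazonaws.com")
    .option("es.port", "443")
    .option("es.nodes.wan.only", "true")
    .option("es.read.unmapped.fields.ignore", "false")
    .load("my-index/_doc"))
```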