Hi,
I am working on an HDFS to Elasticsearch integration via SparkSQL. I am able to read the CSV data from HDFS and create the Elasticsearch index. To create the Elasticsearch index ID I am currently using one of the unique columns from the CSV data. Now my requirement is that the Elasticsearch index ID should be a combination of 2 CSV columns. Is anybody aware of how I could achieve this? I am using the elasticsearch-spark library to create the index. Below is the sample code.
SparkSession sparkSession = SparkSession.builder().config(config).getOrCreate();
SQLContext ctx = sparkSession.sqlContext();

// CSV read options: first row is the header, data lives on HDFS
HashMap<String, String> options = new HashMap<String, String>();
options.put("header", "true");
options.put("path", "hdfs://localhost:9000/test");
Dataset<Row> df = ctx.read().format("com.databricks.spark.csv").options(options).load();

// Index into Elasticsearch, using the "Id" column as the document _id
JavaEsSparkSQL.saveToEs(df, "spark/test", ImmutableMap.of("es.mapping.id", "Id"));
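
For reference, here is roughly what I was thinking of trying (just a sketch, not tested): derive a single column by concatenating the two CSV columns with concat_ws from org.apache.spark.sql.functions, and point es.mapping.id at that derived column. The column names Col1, Col2 and ComboId below are placeholders for my real columns.

import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.concat_ws;

// Build a derived "ComboId" column from the two CSV columns (placeholder names),
// then use it as the Elasticsearch document _id instead of the single "Id" column
Dataset<Row> withId = df.withColumn("ComboId", concat_ws("_", col("Col1"), col("Col2")));
JavaEsSparkSQL.saveToEs(withId, "spark/test", ImmutableMap.of("es.mapping.id", "ComboId"));

Would this be the right approach, or is there a way to tell es.mapping.id to use two columns directly?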
Thanks
Sach