Generating custom _id when exporting from hadoop/spark to ES

Oriol_Lopez_Massague · June 29, 2022, 2:47pm

I want to insert data into ES using hadoop/spark, but with a custom-made _id field (not the autogenerated one). I have seen that there is a method for RDDs, "saveToEsWithMeta":

https://github.com/elastic/elasticsearch-hadoop/blob/969eb3eb2dcff50174ae563e345fde8a4d605316/spark/core/src/main/scala/org/elasticsearch/spark/rdd/EsSpark.scala

Is there a way to do the same thing using a DataFrame?

Keith_Massey · June 30, 2022, 2:17pm

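I think that es.mapping.id is what you want: it names the document field whose value the connector uses as the document _id. Take a look at the "Apache Spark support" page (Elasticsearch for Apache Hadoop [master]) and the "Configuration" page (Elasticsearch for Apache Hadoop [8.3]) in the Elastic documentation.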
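
For illustration, here is a minimal PySpark sketch of how that setting might be passed on a DataFrame write. It assumes the elasticsearch-spark connector jar is on the Spark classpath; the helper names, the column name "doc_id", the index name "my-index", and the localhost:9200 address are all hypothetical and not from this thread.

```python
def es_write_options(id_column, nodes="localhost", port=9200):
    """Build connector settings for a write.

    es.mapping.id names the DataFrame column whose value becomes the
    document _id. The node address and port defaults are assumptions.
    """
    return {
        "es.nodes": nodes,
        "es.port": str(port),
        "es.mapping.id": id_column,
    }


def save_with_custom_id(df, index, id_column, **es_kwargs):
    """Write df to the given index, taking _id from id_column."""
    writer = df.write.format("org.elasticsearch.spark.sql").mode("append")
    for key, value in es_write_options(id_column, **es_kwargs).items():
        writer = writer.option(key, value)
    writer.save(index)


# Hypothetical usage (needs a running Spark session and a reachable cluster):
# from pyspark.sql import SparkSession
# spark = SparkSession.builder.getOrCreate()
# df = spark.createDataFrame(
#     [("a1", "first"), ("b2", "second")], ["doc_id", "title"]
# )
# save_with_custom_id(df, "my-index", "doc_id")
```

With this setup the id column is still part of each document; the Configuration page also lists es.mapping.exclude, which may be useful if the field should not be stored in the document body.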

