Multiple fields as mapping ID


#1

I am pretty new to Elasticsearch. I am using elasticsearch-hadoop version 6.2.4. I read files from HDFS, convert them to bean objects, and write them to Elasticsearch using Spark Structured Streaming.

    StreamingQuery query = dataSet
            .writeStream()
            .format("org.elasticsearch.spark.sql")
            //.outputMode(OutputMode.Append())
            // Forward slashes: "\t" in a Java string literal is a tab and "\c" does not compile
            .option("checkpointLocation", "/tmp/ckpt1")
            .option("es.nodes", "abc.dev.cm.par.xy.hp")
            .option("es.port", "9200")
            // Field of the bean whose value becomes the document _id
            .option("es.mapping.id", "CustomerID")
            .option("es.resource", "testIndex/testType")
            .start();

While writing, I specify one field of the POJO class (CustomerID) as the mapping ID. Can we use multiple fields, or a combination of fields, as the mapping ID? For example, my file contains both a customer ID field and an order ID field. Can we combine the two into something like CustomerID + OrderID?


(James Baiera) #2

If you create a new id field during your Spark processing, you can reference that id field in the es.mapping.id setting. Granted, this will also write the id field out to Elasticsearch as part of the document, so if you want the field to be left out of the document after the id is extracted, you can include its name under es.mapping.exclude.
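
As a rough illustration of that idea, here is a minimal plain-Java sketch with no Spark dependency. The field name `docId`, the `_` separator, and the class and method names are all assumptions made for the example; nothing here comes from the original job. It only shows how the combined id value could be built and which two connector settings would point at it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CompositeIdSketch {
    // Hypothetical name for the synthetic id field; choose any name that
    // does not clash with an existing field on your bean.
    static final String ID_FIELD = "docId";

    // Combine both keys into one string. The separator keeps pairs such as
    // ("1", "23") and ("12", "3") from collapsing into the same id. This
    // assumes the separator does not occur inside the ids themselves.
    static String compositeId(String customerId, String orderId) {
        return customerId + "_" + orderId;
    }

    // Connector settings that reference the synthetic field: use it as the
    // document id, and exclude it from the document body.
    static Map<String, String> esOptions() {
        Map<String, String> opts = new LinkedHashMap<>();
        opts.put("es.mapping.id", ID_FIELD);
        opts.put("es.mapping.exclude", ID_FIELD);
        return opts;
    }

    public static void main(String[] args) {
        System.out.println(compositeId("C42", "O1001"));
        esOptions().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

In the streaming job itself, the same value could be set on the bean before the Dataset is created, or added as a column with something like `dataSet.withColumn("docId", functions.concat_ws("_", functions.col("CustomerID"), functions.col("OrderID")))`, assuming the order column is named `OrderID`. The two settings would then be passed through `.option(...)` in place of the current `es.mapping.id` value.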

