es.mapping.id: documents with a duplicate ID field value are not indexed separately


(ganeshbabu) #1

Hi All,

I am using the elasticsearch-spark-13_2.10 artifact to store documents in ES. My documents have an "INITIATIVE_CODE" field, which I use as the _id field via the setting below:

JavaEsSpark.saveToEs(final_df, "rd_innovation_item/innovation_item",
ImmutableMap.of("es.mapping.id","INITIATIVE_CODE"));

In the input, two documents share the same INITIATIVE_CODE:

US      2987621 31222335        27/07/2016 00:00:00     N       06/02/2017 01:58:54     07/02/2017 08:14:37
US      2987621 29424184        28/07/2016 00:00:00     N       06/02/2017 01:59:14     07/02/2017 08:14:37

After the job completed, I found only one record in the index. Is there a way to index both of these documents in ES?

Please kindly share your thoughts on this.

Many Thanks,
Ganeshbabu R


(James Baiera) #2

Since you have set the INITIATIVE_CODE field as the id, any following documents that have the same value in that field will be indexed over the previous document. You will need to find a unique field to be used as your document id if you want to index both of these documents as separate documents.
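
To illustrate the overwrite behaviour described above, here is a minimal plain-Java sketch that does not use Spark or Elasticsearch at all. It simulates "index by _id" with a map, then shows one possible workaround: deriving a composite ID from two fields. The field name ITEM_ID (for the third column of the sample rows) and the derived field DOC_ID are hypothetical names chosen for this example; they do not come from the original post.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CompositeIdSketch {

    // Simulates indexing by _id: a later document with the same id value
    // replaces the earlier one, which is why only one record survived.
    static Map<String, Map<String, String>> indexBy(List<Map<String, String>> docs, String idField) {
        Map<String, Map<String, String>> index = new LinkedHashMap<>();
        for (Map<String, String> doc : docs) {
            index.put(doc.get(idField), doc);
        }
        return index;
    }

    // Returns copies of the documents with an extra DOC_ID field (hypothetical name)
    // built by joining the values of the given fields with "_".
    static List<Map<String, String>> withCompositeId(List<Map<String, String>> docs, String... fields) {
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> doc : docs) {
            List<String> parts = new ArrayList<>();
            for (String field : fields) {
                parts.add(doc.get(field));
            }
            Map<String, String> copy = new LinkedHashMap<>(doc);
            copy.put("DOC_ID", String.join("_", parts));
            out.add(copy);
        }
        return out;
    }

    // The two sample rows from the question, reduced to the two relevant columns.
    // ITEM_ID is an assumed name for the column that differs between the rows.
    static List<Map<String, String>> sampleDocs() {
        List<Map<String, String>> docs = new ArrayList<>();
        Map<String, String> first = new LinkedHashMap<>();
        first.put("INITIATIVE_CODE", "2987621");
        first.put("ITEM_ID", "31222335");
        Map<String, String> second = new LinkedHashMap<>();
        second.put("INITIATIVE_CODE", "2987621");
        second.put("ITEM_ID", "29424184");
        docs.add(first);
        docs.add(second);
        return docs;
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = sampleDocs();

        // Keyed by INITIATIVE_CODE alone: the second row overwrites the first.
        System.out.println(indexBy(docs, "INITIATIVE_CODE").size()); // prints 1

        // Keyed by the composite DOC_ID: both rows are kept.
        List<Map<String, String>> keyed = withCompositeId(docs, "INITIATIVE_CODE", "ITEM_ID");
        System.out.println(indexBy(keyed, "DOC_ID").size()); // prints 2
    }
}
```

In the real job, the equivalent step would be to add such a combined column to final_df before saving and point "es.mapping.id" at that column instead of INITIATIVE_CODE. Alternatively, if the document ID does not need to be meaningful, leaving out "es.mapping.id" lets Elasticsearch generate a unique ID for each document.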

