es.mapping.id: documents with a duplicate ID value are not all indexed

Hi All,

I am using the elasticsearch-spark-13_2.10 artifact to store documents in ES. My documents have an "INITIATIVE_CODE" field, which I use as the _id field via the setting below:

JavaEsSpark.saveToEs(final_df, "rd_innovation_item/innovation_item",
ImmutableMap.of("es.mapping.id","INITIATIVE_CODE"));

In the input, two documents have the same INITIATIVE_CODE:

US      2987621 31222335        27/07/2016 00:00:00     N       06/02/2017 01:58:54     07/02/2017 08:14:37
US      2987621 29424184        28/07/2016 00:00:00     N       06/02/2017 01:59:14     07/02/2017 08:14:37

After the job completed, I found only one record in the index. Is there a way to index both of these documents in ES?

Please kindly share your thoughts on this.

Many Thanks,
Ganeshbabu R

Since you have set the INITIATIVE_CODE field as the document ID, any later document with the same value in that field overwrites the previous one. To index both of these as separate documents, you will need to use a field whose value is unique per document as the document ID.
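
To illustrate why only one record survives, here is a minimal plain-Java sketch that mimics index-by-ID behaviour with a map. It does not use Elasticsearch or Spark. The class name, the helper names, and the column meanings are assumptions: the post does not name the columns, so the second column is taken to be INITIATIVE_CODE and the third is treated as some other field that differs between the two rows.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only; not part of the elasticsearch-spark API.
class IdCollisionSketch {

    // First three columns of the two sample rows from the question.
    // Assumption: index 1 is INITIATIVE_CODE, index 2 is another field
    // whose real name is not given in the post.
    static List<String[]> sampleRows() {
        return Arrays.asList(
            new String[] {"US", "2987621", "31222335"},
            new String[] {"US", "2987621", "29424184"});
    }

    // Mimics indexing by ID: a later row with the same ID replaces the earlier one.
    static Map<String, String[]> indexBy(List<String[]> rows,
                                         Function<String[], String> idOf) {
        Map<String, String[]> index = new LinkedHashMap<>();
        for (String[] row : rows) {
            index.put(idOf.apply(row), row);
        }
        return index;
    }

    public static void main(String[] args) {
        List<String[]> rows = sampleRows();
        // ID taken from the assumed INITIATIVE_CODE column alone: the rows collide.
        System.out.println(indexBy(rows, r -> r[1]).size());
        // ID built from two columns: each row gets its own ID.
        System.out.println(indexBy(rows, r -> r[1] + "_" + r[2]).size());
    }
}
```

In the Spark job, the equivalent idea would be to add such a combined value as its own column in the DataFrame and pass that column's name to es.mapping.id, assuming the combination really is unique in your data. If you do not need to control the ID at all, another option is to leave es.mapping.id unset so that Elasticsearch generates an ID for each document.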
