I have documents with duplicate data, for example:
Doc 1 :- { id : 1, empid : 1, name : Peter }
Doc 2 :- { id : 2, empid : 2, name : Leo }
Doc 3 :- { id : 1, empid : 1, name : Peter }
Doc 4 :- { id : 3, empid : 3, name : Denver }
I want to remove the documents that contain duplicate data, i.e. Doc 3 in the case above. Is there a query that deletes documents like this?
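To make the expected result concrete, here is a minimal sketch in plain Python of the outcome I am after. It is not a query against the index, only an in-memory illustration; the function name `find_duplicates` and the `(label, fields)` layout are hypothetical, and it assumes two documents are duplicates when all of their fields match.

```python
import json


def find_duplicates(docs):
    """Return the labels of documents whose fields repeat an earlier document."""
    seen = set()
    duplicates = []
    for label, fields in docs:
        # Serialize with sorted keys so field order does not affect the comparison.
        key = json.dumps(fields, sort_keys=True)
        if key in seen:
            duplicates.append(label)  # same content was already seen: candidate for deletion
        else:
            seen.add(key)
    return duplicates


# The four sample documents from the question.
docs = [
    ("Doc 1", {"id": 1, "empid": 1, "name": "Peter"}),
    ("Doc 2", {"id": 2, "empid": 2, "name": "Leo"}),
    ("Doc 3", {"id": 1, "empid": 1, "name": "Peter"}),
    ("Doc 4", {"id": 3, "empid": 3, "name": "Denver"}),
]

print(find_duplicates(docs))  # ['Doc 3']
```

The first occurrence (Doc 1) is kept and only the later copy is reported, which is the behaviour I would like from the delete.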