Hi,
I currently have duplicate records across different indexes: they have the same field values but, obviously, different _id values.
To remove these duplicates, this is what I would like to do:
query a specific index
compare each record found with the other records in that index
if the value of myfield in record1 equals the value of myfield in record2, drop record1
Essentially, written in a generic scripting language, the procedure would look like:
foreach element in index.2017.06.6 {
    if element.myfield equals element[+1].myfield {
        delete element
    }
}
Can someone help me? I'm not asking for the code itself; I would like some guidelines to understand how scripts are executed and how I can pass parameters to my search.
You won't be able to do this within a search script: the script can only decide whether a document matches the search, it cannot remove a document at the same time. I think you will need to do this from your client, and I don't think you will need an Elasticsearch script at all. Run a normal query, go through the results with the pseudocode you already have above, and when you find the condition you are looking for, issue a delete request for that document.
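As a rough illustration of that client-side approach, here is a minimal Python sketch. It is only one possible way to do it, under some assumptions: the official `elasticsearch` Python client is installed, the field is called `myfield` as in the question, and the function names `find_duplicate_ids` and `delete_duplicates` are made up for this example. Unlike the pseudocode, it keeps the first document seen for each value and marks every later copy, so the duplicates do not have to be adjacent in the results.

```python
import json
from typing import Any, Dict, Iterable, List


def find_duplicate_ids(hits: Iterable[Dict[str, Any]], field: str) -> List[str]:
    """Return the _id of every hit whose `field` value was already seen.

    `hits` are search hits shaped like {"_id": ..., "_source": {...}}.
    The first document with a given value is kept; later ones are reported.
    """
    seen = set()
    duplicates = []
    for hit in hits:
        value = hit["_source"].get(field)
        if value is None:
            continue  # documents without the field are left alone
        # Serialise the value so lists and nested objects can be compared too.
        key = json.dumps(value, sort_keys=True)
        if key in seen:
            duplicates.append(hit["_id"])
        else:
            seen.add(key)
    return duplicates


def delete_duplicates(client, index: str, field: str) -> int:
    """Hypothetical helper: read all of `index`, then delete the duplicates.

    Assumes a recent official client; older client versions also need a
    doc_type argument on delete().
    """
    from elasticsearch.helpers import scan  # imported lazily on purpose

    hits = scan(client, index=index, query={"query": {"match_all": {}}})
    ids = find_duplicate_ids(hits, field)
    for doc_id in ids:
        client.delete(index=index, id=doc_id)
    return len(ids)
```

With a connected client this would be called as `delete_duplicates(client, "index.2017.06.6", "myfield")`. It is worth running `find_duplicate_ids` on its own first and inspecting the returned ids before deleting anything.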