Hello
I'm new to Elasticsearch. The project I'm going to show you here is a study project. I could use your help deduplicating records that were indexed 10 times over half a day.
I tried different approaches that didn't work (a Python script, an Elasticsearch query, ...).
Here is a visual that illustrates my problem better:
Instead, I'd probably reindex the whole dataset into a new index and use the hash as the document id, if possible. Documents that share the same hash then end up under the same id, so each later copy overwrites the earlier one and only one remains.
You can use the reindex API to read the data from the source index and send the documents to the destination index.
Adding an ingest pipeline to set the id of each document would be useful here.
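To make that suggestion concrete, here is a minimal sketch of the two requests involved, written in Python only to build and print the request bodies (nothing is sent to a cluster). The field name `hash`, the destination index `blockchain_dedup` and the pipeline name `hash_as_id` are assumptions for illustration; `blockchain` is the index name from the error message in this thread. Adjust them to your data.

```python
import json


def build_dedup_requests(source_index, dest_index, hash_field, pipeline_id):
    """Return (method, path, body) tuples for the pipeline and the reindex call."""
    # Ingest pipeline: a "set" processor copies the hash field into _id,
    # so documents with the same hash get the same id in the new index.
    pipeline_body = {
        "description": "Use the hash field as the document _id",
        "processors": [
            {"set": {"field": "_id", "value": "{{" + hash_field + "}}"}}
        ],
    }
    # Reindex: read from the source index, write to the destination index,
    # and run every document through the pipeline on the way.
    reindex_body = {
        "source": {"index": source_index},
        "dest": {"index": dest_index, "pipeline": pipeline_id},
    }
    return [
        ("PUT", "/_ingest/pipeline/" + pipeline_id, pipeline_body),
        ("POST", "/_reindex", reindex_body),
    ]


# All names below except "blockchain" are hypothetical placeholders.
requests_to_send = build_dedup_requests(
    "blockchain", "blockchain_dedup", "hash", "hash_as_id"
)
for method, path, body in requests_to_send:
    print(method, path)
    print(json.dumps(body, indent=2))
```

You can paste the printed method, path and body into the Kibana Dev Tools console, creating the pipeline first and then running the reindex.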
{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "Rejecting mapping update to [blockchain] as the final mapping would have more than 1 type: [_doc, _reindex]"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "Rejecting mapping update to [blockchain] as the final mapping would have more than 1 type: [_doc, _reindex]"
  },
  "status" : 400
}
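The error you got from the command line happens because you put the index name in the request path: /blockchain/_reindex. With that path, Elasticsearch tries to index a document into [blockchain] with a mapping type named _reindex, which is why it complains about a second type next to _doc.
The request should go to the /_reindex API alone, with the source and destination indices given in the request body.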
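As a rough sketch of the corrected call, the snippet below builds (but does not send) the HTTP request with Python's standard library. The host `http://localhost:9200` and the destination index `blockchain_dedup` are assumptions; replace them with your own values.

```python
import json
import urllib.request

# Assumed cluster address and destination index; adjust to your setup.
ES_URL = "http://localhost:9200"

body = {
    "source": {"index": "blockchain"},      # index named in the error above
    "dest": {"index": "blockchain_dedup"},  # hypothetical new index
}

# The path is just /_reindex: no index name in front of it.
req = urllib.request.Request(
    ES_URL + "/_reindex",
    data=json.dumps(body).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

print(req.get_method(), req.full_url)
# To actually run it against a live cluster:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```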
Please don't post images of text as they are hard to read, may not display correctly for everyone, and are not searchable.
Instead, paste the text and format it with the </> icon or pairs of triple backticks (```), and check the preview window to make sure it's properly formatted before posting. This makes it more likely that your question will receive a useful answer.
It would be great if you could update your post to solve this.