How to identify and remove duplicates in an Elasticsearch index

This blog post might be useful. I am not sure there is a reliable way to write a query for delete by query that matches only the duplicate documents, so the approach described in the blog post may be safer.
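To make the idea of avoiding delete by query a bit more concrete, here is a minimal sketch of one client-side alternative: read the documents, group them by a hash of the fields that define a duplicate, and collect every ID after the first in each group. This is not necessarily what the blog post does. The function name `find_duplicate_ids`, the `key_fields` parameter, and the sample field names `host` and `message` are all made up for illustration; the only thing assumed about the input is the usual search-hit shape with `_id` and `_source`.

```python
import hashlib
import json
from collections import defaultdict


def find_duplicate_ids(hits, key_fields):
    """Return the IDs of documents that duplicate an earlier hit.

    `hits` is any iterable of dicts shaped like Elasticsearch search hits
    ({"_id": ..., "_source": {...}}). `key_fields` is a hypothetical
    parameter naming the fields whose combined values define a duplicate.
    """
    ids_by_key = defaultdict(list)
    for hit in hits:
        source = hit["_source"]
        # Serialize the key fields deterministically, then hash them so
        # long field values do not bloat the in-memory dictionary.
        raw = json.dumps(
            [source.get(field) for field in key_fields],
            sort_keys=True,
            default=str,
        )
        digest = hashlib.sha256(raw.encode("utf-8")).hexdigest()
        ids_by_key[digest].append(hit["_id"])

    # Keep the first ID seen in each group; the rest are duplicates.
    duplicates = []
    for ids in ids_by_key.values():
        duplicates.extend(ids[1:])
    return duplicates


# Illustrative data only; the field names are assumptions.
sample_hits = [
    {"_id": "1", "_source": {"host": "a", "message": "x"}},
    {"_id": "2", "_source": {"host": "a", "message": "x"}},
    {"_id": "3", "_source": {"host": "b", "message": "x"}},
    {"_id": "4", "_source": {"host": "b", "message": "x"}},
    {"_id": "5", "_source": {"host": "a", "message": "y"}},
]

duplicate_ids = find_duplicate_ids(sample_hits, ["host", "message"])
print(duplicate_ids)
```

Against a real cluster, the hits could probably be streamed with `elasticsearch.helpers.scan` from the official Python client, and the resulting IDs removed with delete actions passed to `elasticsearch.helpers.bulk`. Since the deletion is driven by explicit IDs, you can inspect the list before anything is removed, which is the main reason this feels safer than a single delete by query.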