Clustering data in an Elasticsearch index

The algorithm described in "Identifying and Filtering Near-Duplicate Documents" (SpringerLink) is one that I believe would be made easier by having fingerprinting support.
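To make concrete what I mean by fingerprinting, here is a rough sketch of the shingle-and-min-hash idea as I understand it from that paper. It is plain Python and does not use any Elasticsearch API. The function names (`shingles`, `fingerprint`, `sketch`, `estimated_resemblance`), the 3-word shingle size and the 64 hash functions are my own assumptions for illustration, not something taken from the paper or from Elasticsearch.

```python
import hashlib


def shingles(text, k=3):
    """Return the set of k-word shingles (contiguous word windows) in text."""
    words = text.lower().split()
    if not words:
        return set()
    if len(words) < k:
        # Text shorter than one window: treat the whole text as one shingle.
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def fingerprint(shingle, seed=0):
    """Hash one shingle to a 64-bit integer; the seed picks the hash function."""
    digest = hashlib.blake2b(
        shingle.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "big")


def sketch(text, num_hashes=64, k=3):
    """Min-hash sketch: for each hash function keep the smallest fingerprint."""
    sh = shingles(text, k)
    if not sh:
        return []
    return [min(fingerprint(s, seed) for s in sh) for seed in range(num_hashes)]


def estimated_resemblance(sketch_a, sketch_b):
    """Fraction of matching sketch positions, an estimate of Jaccard overlap."""
    if not sketch_a or len(sketch_a) != len(sketch_b):
        return 0.0
    matches = sum(a == b for a, b in zip(sketch_a, sketch_b))
    return matches / len(sketch_a)


if __name__ == "__main__":
    doc_a = "the quick brown fox jumps over the lazy dog near the river bank"
    doc_b = "the quick brown fox jumps over the lazy dog near the river shore"
    print(estimated_resemblance(sketch(doc_a), sketch(doc_b)))
```

If sketches like these were stored alongside each document at index time, near-duplicate candidates could be found by comparing the short sketches rather than the full text, which is the part I think fingerprinting support would simplify.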

Isabel