FSCrawler - Duplicate Files

Bowriverstudio · August 7, 2019, 8:32pm

Hello All,

I am new to Elasticsearch.

I am interested in using FSCrawler to index all my files and then using Elasticsearch to find duplicates.

Or would something like this be a better approach?

Or is there a better way?

Cheers,
Maurice

dadoonet · August 10, 2019, 4:03pm

Hey Maurice

It could be easier not to index duplicated files in the first place.

Would that be an option?
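If filtering before indexing is possible, a rough sketch of the idea could look like the following. It hashes every file under a directory and groups files with identical content. The function names `file_digest` and `find_duplicates` are made up for this illustration and are not part of FSCrawler; SHA-256 is just one reasonable choice of hash.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Map each digest to the sorted paths sharing it (groups of 2+ only)."""
    groups = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return {digest: sorted(paths) for digest, paths in groups.items() if len(paths) > 1}
```

Every path after the first in each group is a candidate to skip or exclude before the crawler runs.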

Otherwise, I think the last link you shared is a good way to go.
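For finding duplicates after indexing, one common pattern (not necessarily what the linked page does) is a `terms` aggregation with `min_doc_count: 2` on a content checksum. The sketch below only builds the search body and parses a response, so it does not depend on any particular client. It assumes the FSCrawler job has checksums enabled so that each document carries a keyword field `file.checksum`, and that the file path lives in `path.real`; please check both against your own mapping. The helper names are hypothetical.

```python
def duplicate_checksum_query(field="file.checksum", max_groups=1000):
    """Build a search body that buckets documents by checksum.

    Only checksums shared by at least two documents produce a bucket.
    The field name is an assumption about the index mapping.
    """
    return {
        "size": 0,  # no top-level hits needed, only the aggregation
        "aggs": {
            "duplicates": {
                "terms": {"field": field, "min_doc_count": 2, "size": max_groups},
                "aggs": {
                    # Return a few documents per bucket to see which files collide.
                    "files": {"top_hits": {"size": 10, "_source": ["path.real"]}}
                },
            }
        },
    }


def duplicate_groups(response):
    """Turn the aggregation response into {checksum: [path, ...]}."""
    groups = {}
    for bucket in response["aggregations"]["duplicates"]["buckets"]:
        hits = bucket["files"]["hits"]["hits"]
        groups[bucket["key"]] = [hit["_source"]["path"]["real"] for hit in hits]
    return groups
```

The body returned by `duplicate_checksum_query()` can be sent to the `_search` endpoint of the index with whichever client you use, and the JSON response passed to `duplicate_groups()`.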

Bowriverstudio · August 12, 2019, 4:03pm

Unfortunately I do not control the source of the files being indexed. Thank you for the response.

system · September 9, 2019, 4:03pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
