(R) #1

Hi team,

I wanted to know how Logstash will handle this data. I am collecting IOCs (Indicators of Compromise), which come in the form of domains or URLs. With a few bash commands I extract only the domain part of each entry, and those entries go into one file.

While doing that, the script also sorts the entries and keeps only the unique ones, so only unique entries go into the final file. Now I need to push those entries, along with a few other fields, into the ES database.
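For anyone following along, the extract-and-deduplicate step can be sketched roughly as below. This is only an illustration in Python, not the actual bash commands I use; the function names `extract_domain` and `unique_domains` and the sample entries are made up for the example.

```python
from urllib.parse import urlsplit


def extract_domain(entry):
    """Return the lowercase hostname of a URL or bare domain, or None."""
    entry = entry.strip()
    if not entry:
        return None
    # urlsplit only recognises the host when a "//" separator is present,
    # so prepend one to bare entries such as "example.com/path".
    if "://" not in entry:
        entry = "//" + entry
    host = urlsplit(entry).hostname
    return host.lower() if host else None


def unique_domains(entries):
    """Roughly what `sort -u` does after the domain part is extracted."""
    return sorted({d for d in map(extract_domain, entries) if d})


if __name__ == "__main__":
    # Hypothetical sample entries, not real IOCs.
    sample = [
        "http://Evil.example.com:8080/a?b=1",
        "evil.example.com/other",
        "https://bad.example.org",
    ]
    for domain in unique_domains(sample):
        print(domain)
```

Note that this only removes duplicates within one run; it says nothing about entries that were already pushed to ES earlier.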

So I wanted to know whether Logstash can handle the deduplication. Or, since I am already using the sort command, will only new entries get added to the ES database?

Here are the raw entries I am getting; after sanitizing them I push the output into a text file:

The final entries would look like:

Blason R

(Kabilan) #2

In terms of deduplication, I would be curious to see the answer as well. As for cleaning up the entries, maybe try the dissect filter?

(R) #3

Well, I believe the Elasticsearch stack does not deduplicate by default.
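As far as I understand it, the reason is that Elasticsearch generates a fresh `_id` for every document that arrives without one, so sending the same entry twice stores it twice. Indexing with an explicit `_id` that already exists overwrites the stored document instead. A common way to use this from Logstash is the fingerprint filter combined with the `document_id` option of the elasticsearch output, though I have not tested that on this data. The idea itself can be sketched in Python with a plain dict standing in for the index; `doc_id` and `index_domains` are hypothetical helper names, and SHA-1 is just one possible choice of hash.

```python
import hashlib


def doc_id(domain):
    """Derive a deterministic document ID from the normalised domain."""
    normalised = domain.strip().lower()
    return hashlib.sha1(normalised.encode("utf-8")).hexdigest()


def index_domains(index, domains):
    """Store each domain under its deterministic ID.

    `index` is a dict standing in for an Elasticsearch index (an assumption
    for this sketch): writing to an existing key overwrites the entry, much
    like indexing a document with an `_id` that already exists.
    """
    for domain in domains:
        index[doc_id(domain)] = {"domain": domain.strip().lower()}
    return index


if __name__ == "__main__":
    index = {}
    index_domains(index, ["evil.example.com", "bad.example.org"])
    # A second run repeats one entry and adds one new entry.
    index_domains(index, ["evil.example.com", "new.example.net"])
    print(len(index))
```

With IDs derived this way, re-pushing the whole file should not create duplicates, because repeated domains map to the same ID.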

(R) #4

OK guys, this is solved with the grok filter.
