Check if similar documents exist?


(ANTOINE) #1

Hi,

I'm doing some tests with ELK (latest version). Here is my use case:

I'm indexing some documents containing phone activity, where I use these fields:

  • SYSTEMA.EndUserName
  • SYSTEMA.EndUserPhoneNumber
  • SYSTEMA.CounterpartPhoneNumber
  • SYSTEMA.Call Direction
  • SYSTEMA.Date

On the other side, I'm also indexing similar documents:

  • SYSTEMB.EndUserName
  • SYSTEMB.EndUserPhoneNumber
  • SYSTEMB.CounterpartPhoneNumber
  • SYSTEMB.Call Direction
  • SYSTEMB.Date

My goal is to check, for each SYSTEMA document, whether a similar SYSTEMB document exists (with the same EndUserName, EndUserPhoneNumber, CounterpartPhoneNumber and CallDirection) and approximately the same date (within a few seconds).
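
To make the matching rule concrete, here is a minimal sketch in plain Python, outside of Elasticsearch. The flat dictionaries, the `is_similar` helper and the 5-second tolerance are assumptions for illustration only; the field names are the ones listed above without their SYSTEMA/SYSTEMB prefix.

```python
from datetime import datetime, timedelta

# Fields that must be identical in both systems (prefix removed for the comparison).
MATCH_FIELDS = ("EndUserName", "EndUserPhoneNumber", "CounterpartPhoneNumber", "Call Direction")


def is_similar(doc_a, doc_b, tolerance=timedelta(seconds=5)):
    """Return True if the match fields are equal and the dates differ by at most `tolerance`."""
    same_fields = all(doc_a[name] == doc_b[name] for name in MATCH_FIELDS)
    close_in_time = abs(doc_a["Date"] - doc_b["Date"]) <= tolerance
    return same_fields and close_in_time


# Hypothetical sample data: the same call as seen by each system.
call_a = {
    "EndUserName": "alice",
    "EndUserPhoneNumber": "+33100000001",
    "CounterpartPhoneNumber": "+33100000002",
    "Call Direction": "outbound",
    "Date": datetime(2024, 1, 15, 10, 30, 0),
}
call_b = dict(call_a, Date=datetime(2024, 1, 15, 10, 30, 3))     # 3 seconds later
call_late = dict(call_a, Date=datetime(2024, 1, 15, 10, 31, 0))  # 60 seconds later

print(is_similar(call_a, call_b))     # True: same fields, within tolerance
print(is_similar(call_a, call_late))  # False: same fields, too far apart in time
```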

I thought about creating a scripted field on SYSTEMA that checks whether a similar SYSTEMB document exists. What do you think?

Do you have another way to achieve this?

Many thanks,
Regards.


#2

Why not put all the fields at the top level and add an extra field called "system" whose value is "a" or "b"? Then use a cardinality aggregation sorted from lowest to highest. That way, it will collect unique values into buckets and then display the buckets that have the most duplicate documents for those fields.
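
One possible reading of this suggestion, sketched as a request body built in Python: a composite aggregation buckets the documents by the four shared fields, and a cardinality sub-aggregation counts how many distinct "system" values each bucket contains. The flattened field names, their keyword mapping, the `build_grouping_request` and `unmatched_keys` helpers, and the sample response are all assumptions, and the sketch ignores the date tolerance entirely.

```python
import json

# Assumed top-level field names after flattening; all assumed to be mapped as keyword.
MATCH_FIELDS = ["EndUserName", "EndUserPhoneNumber", "CounterpartPhoneNumber", "CallDirection"]


def build_grouping_request(page_size=100):
    """Bucket documents by the shared fields and count distinct systems per bucket."""
    sources = [{name: {"terms": {"field": name}}} for name in MATCH_FIELDS]
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        "aggs": {
            "calls": {
                "composite": {"size": page_size, "sources": sources},
                "aggs": {"systems": {"cardinality": {"field": "system"}}},
            }
        },
    }


def unmatched_keys(response):
    """Return the bucket keys that were seen in only one system."""
    buckets = response["aggregations"]["calls"]["buckets"]
    return [b["key"] for b in buckets if b["systems"]["value"] < 2]


print(json.dumps(build_grouping_request(), indent=2))

# Hypothetical response fragment, shaped like a composite aggregation result.
sample_response = {"aggregations": {"calls": {"buckets": [
    {"key": {"EndUserName": "alice"}, "doc_count": 2, "systems": {"value": 2}},
    {"key": {"EndUserName": "bob"}, "doc_count": 1, "systems": {"value": 1}},
]}}}
print(unmatched_keys(sample_response))
```

Buckets whose "systems" value is 2 contain documents from both systems; a value of 1 means the call was seen by only one of them.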


(ANTOINE) #3

Thank you for your reply.

I'm looking for the most efficient way to achieve this. Do you think that adding a scripted field that looks for a similar document is the best way?

I'm thinking of something like a scripted field on system A that tells whether a similar document exists for system B.
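
Whatever mechanism ends up running it, the check amounts to one lookup per system A document. Here is a rough sketch of the search body such a lookup could use, built in Python. The `build_systemb_lookup` helper, the flat dotted keys, the 5-second default and the assumption that the exact-match fields are mapped as keyword are all hypothetical.

```python
from datetime import datetime, timedelta

# Fields that must match exactly; names follow the field lists in the first post.
EXACT_FIELDS = ("EndUserName", "EndUserPhoneNumber", "CounterpartPhoneNumber", "Call Direction")


def build_systemb_lookup(doc_a, tolerance_seconds=5):
    """Build a search body that finds SYSTEMB documents similar to one SYSTEMA document."""
    delta = timedelta(seconds=tolerance_seconds)
    date_a = doc_a["SYSTEMA.Date"]
    # One exact-match filter per shared field.
    filters = [
        {"term": {f"SYSTEMB.{name}": doc_a[f"SYSTEMA.{name}"]}}
        for name in EXACT_FIELDS
    ]
    # Date window of +/- tolerance around the SYSTEMA date.
    filters.append({"range": {"SYSTEMB.Date": {
        "gte": (date_a - delta).isoformat(),
        "lte": (date_a + delta).isoformat(),
    }}})
    # size 1 is enough: we only need to know whether at least one match exists.
    return {"size": 1, "query": {"bool": {"filter": filters}}}


# Hypothetical SYSTEMA document.
doc_a = {
    "SYSTEMA.EndUserName": "alice",
    "SYSTEMA.EndUserPhoneNumber": "+33100000001",
    "SYSTEMA.CounterpartPhoneNumber": "+33100000002",
    "SYSTEMA.Call Direction": "outbound",
    "SYSTEMA.Date": datetime(2024, 1, 15, 10, 30, 0),
}
body = build_systemb_lookup(doc_a)
print(body["query"]["bool"]["filter"][-1])
```

A similar system B document exists when this search returns at least one hit.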

What do you think?

Thanks again for your answer.


#4

I'm not sure since I'm relatively new to Elasticsearch, sorry! I think that a scripted field could work, but I don't know if there are better ways.


(system) #5

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.