I'm doing some tests with ELK (latest version). Here is my use case:
I'm indexing some documents containing phone activity, where I use these fields:
SYSTEMA.EndUserName
SYSTEMA.EndUserPhoneNumber
SYSTEMA.CounterpartPhoneNumber
SYSTEMA.Call Direction
SYSTEMA.Date
On the other side, I'm also indexing similar documents:
SYSTEMB.EndUserName
SYSTEMB.EndUserPhoneNumber
SYSTEMB.CounterpartPhoneNumber
SYSTEMB.Call Direction
SYSTEMB.Date
My goal is to find, for any SYSTEMA document, whether a similar SYSTEMB document exists (with the same EndUserName, EndUserPhoneNumber, CounterpartPhoneNumber and CallDirection) and approximately the same date (within a few seconds).
I thought about creating a scripted field for SYSTEMA that checks whether a similar SYSTEMB document exists. What do you think?
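To make the matching rule concrete, here is a minimal sketch in plain Python (no Elasticsearch involved). It assumes the documents have already been loaded as dictionaries, that the field names are flattened (without the SYSTEMA/SYSTEMB prefix), and that "a few seconds" means a 5-second tolerance. The function name `find_matches` and the sample values are hypothetical.

```python
from datetime import datetime, timedelta

# Fields that must be identical in both systems (names assumed, prefix dropped).
KEY_FIELDS = ("EndUserName", "EndUserPhoneNumber",
              "CounterpartPhoneNumber", "CallDirection")


def find_matches(system_a_docs, system_b_docs, tolerance=timedelta(seconds=5)):
    """Return (a, b) pairs with equal key fields and dates within `tolerance`."""
    # Index SYSTEMB documents by their key fields for fast lookup.
    b_by_key = {}
    for doc in system_b_docs:
        key = tuple(doc[f] for f in KEY_FIELDS)
        b_by_key.setdefault(key, []).append(doc)

    matches = []
    for a in system_a_docs:
        key = tuple(a[f] for f in KEY_FIELDS)
        for b in b_by_key.get(key, []):
            if abs(a["Date"] - b["Date"]) <= tolerance:
                matches.append((a, b))
                break  # the first SYSTEMB match is enough for this SYSTEMA doc
    return matches


# Hypothetical sample data.
a_docs = [
    {"EndUserName": "alice", "EndUserPhoneNumber": "100",
     "CounterpartPhoneNumber": "200", "CallDirection": "outbound",
     "Date": datetime(2020, 1, 1, 12, 0, 0)},
    {"EndUserName": "bob", "EndUserPhoneNumber": "300",
     "CounterpartPhoneNumber": "400", "CallDirection": "inbound",
     "Date": datetime(2020, 1, 1, 12, 5, 0)},
]
b_docs = [
    {"EndUserName": "alice", "EndUserPhoneNumber": "100",
     "CounterpartPhoneNumber": "200", "CallDirection": "outbound",
     "Date": datetime(2020, 1, 1, 12, 0, 3)},   # 3 s apart: counts as a match
    {"EndUserName": "bob", "EndUserPhoneNumber": "300",
     "CounterpartPhoneNumber": "400", "CallDirection": "inbound",
     "Date": datetime(2020, 1, 1, 12, 6, 0)},   # 60 s apart: no match
]

matches = find_matches(a_docs, b_docs)
```

With this sample data, only the "alice" call is paired, because the "bob" documents are a full minute apart.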
Why not put all the fields at the top level and add an extra field called "system" whose value is "a" or "b"? Then just run a cardinality aggregation sorted from lowest to highest. That way, it will collect the unique values into buckets and then display the buckets that have the most duplicate documents for those fields.
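The idea of a shared layout with a "system" field can be illustrated outside Elasticsearch with a small Python sketch. It groups documents by the four matching fields and records which systems each group was seen in; this is roughly what bucketing on those fields and counting distinct "system" values per bucket would give. Note that a cardinality aggregation on its own returns a count of distinct values rather than buckets, so the bucketing part would need a bucket aggregation such as terms. The field names, the helper `systems_per_call` and the sample values are assumptions, and the date tolerance is not handled here.

```python
from collections import defaultdict

# Fields shared by both systems once the prefix is dropped (names assumed).
CALL_FIELDS = ("EndUserName", "EndUserPhoneNumber",
               "CounterpartPhoneNumber", "CallDirection")


def systems_per_call(docs):
    """Map each combination of call fields to the set of systems that logged it."""
    seen = defaultdict(set)
    for doc in docs:
        key = tuple(doc[f] for f in CALL_FIELDS)
        seen[key].add(doc["system"])
    return dict(seen)


# Hypothetical flattened documents, each tagged with its source system.
docs = [
    {"system": "a", "EndUserName": "alice", "EndUserPhoneNumber": "100",
     "CounterpartPhoneNumber": "200", "CallDirection": "outbound"},
    {"system": "b", "EndUserName": "alice", "EndUserPhoneNumber": "100",
     "CounterpartPhoneNumber": "200", "CallDirection": "outbound"},
    {"system": "a", "EndUserName": "bob", "EndUserPhoneNumber": "300",
     "CounterpartPhoneNumber": "400", "CallDirection": "inbound"},
]

groups = systems_per_call(docs)

# Keep only the calls that were logged by both systems.
in_both = [key for key, systems in groups.items() if systems >= {"a", "b"}]
```

Here `in_both` contains only the "alice" call, since the "bob" call appears in system "a" alone.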