I am trying to get a range of messages by timestamp from two different data sources and compare them to see if they match (say, on a particular timestamp and messageid combination).
Should I do the comparison on multiple messages using the following?
(They are two different indices.)
Or
should I create two different docs (doc_a and doc_b) under the same index?
And how do I compare the messages (timeA - timeB) from source 1 and source 2? I looked into the More Like This (mlt) query, but it doesn't seem to be meant for comparing two arrays of messages.
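To illustrate the comparison being asked about, here is a minimal client-side sketch in Python. It assumes both ranges of messages have already been fetched (for example with a range query on the timestamp) and that each message is a dict with `messageid` and `timestamp` keys, the latter in epoch milliseconds. The function name `compare_messages` and the sample data are made up for illustration.

```python
def compare_messages(source_a, source_b):
    """Match messages on messageid and report timeA - timeB for each match.

    Returns (matched, only_a, only_b), where matched is a list of
    (messageid, timestamp difference) tuples.
    """
    # Index source B by messageid so each lookup is O(1).
    by_id_b = {m["messageid"]: m for m in source_b}
    matched, only_a = [], []
    for msg in source_a:
        other = by_id_b.pop(msg["messageid"], None)
        if other is None:
            only_a.append(msg["messageid"])
        else:
            matched.append((msg["messageid"], msg["timestamp"] - other["timestamp"]))
    # Whatever is left in the index was never seen in source A.
    only_b = list(by_id_b)
    return matched, only_a, only_b


# Hypothetical sample data.
source_1 = [
    {"messageid": "m1", "timestamp": 1000},
    {"messageid": "m2", "timestamp": 2050},
    {"messageid": "m3", "timestamp": 3000},
]
source_2 = [
    {"messageid": "m1", "timestamp": 1000},
    {"messageid": "m2", "timestamp": 2000},
    {"messageid": "m4", "timestamp": 4000},
]

matched, only_a, only_b = compare_messages(source_1, source_2)
```

A difference of 0 means the timestamp and messageid both match. This only works when the two ranges are small enough to hold in memory, and it assumes a messageid appears at most once per source.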
I would like to compare two fields from the docs. Aggregations seem like a good option. For example:
Each message has a seqnumber field and a timestamp field. Each doc has multiple messages.
I put both doc_a and doc_b in the same index.
Using a terms aggregation on the field "seqnumber", each seqnumber should become a bucket key with a doc_count.
I tried this, but I am not able to get a doc_count of 2 when I put two identical sets of messages in the two docs.
For the seqnumber field, each bucket should show the total across doc_a.seqnumber and doc_b.seqnumber,
but my result buckets did not have the correct totals.
Another question: I don't know how to get the combination of the seqnumber and timestamp fields as aggregation buckets.
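One possible approach to the seqnumber and timestamp combination is sketched below in Python. It rests on assumptions that the post does not state: the messages are stored in an array field called `messages`, and that field is mapped as `nested`. Without a nested mapping, Elasticsearch flattens arrays of objects, so the link between a message's seqnumber and its timestamp is lost, which may also be part of why the counts looked wrong. The sketch builds a request body with a `nested` aggregation wrapping a `terms` aggregation on seqnumber and a sub-`terms` aggregation on timestamp, then reads pairs whose count is not 2 out of a response. The names `build_pair_agg`, `unmatched_pairs`, `msgs`, `by_seq` and `by_ts` are hypothetical, and the sample response is hand-written, not real output.

```python
def build_pair_agg(path="messages", size=1000):
    """Build a search body that buckets nested messages by seqnumber, then timestamp."""
    return {
        "size": 0,  # only the aggregation results are needed, not the hits
        "aggs": {
            "msgs": {
                "nested": {"path": path},  # assumes `path` is mapped as nested
                "aggs": {
                    "by_seq": {
                        "terms": {"field": f"{path}.seqnumber", "size": size},
                        "aggs": {
                            "by_ts": {
                                "terms": {"field": f"{path}.timestamp", "size": size}
                            }
                        },
                    }
                },
            }
        },
    }


def unmatched_pairs(response, expected=2):
    """Return (seqnumber, timestamp) pairs whose doc_count differs from `expected`."""
    pairs = []
    for seq in response["aggregations"]["msgs"]["by_seq"]["buckets"]:
        for ts in seq["by_ts"]["buckets"]:
            if ts["doc_count"] != expected:
                pairs.append((seq["key"], ts["key"]))
    return pairs


body = build_pair_agg()

# Hand-written response in the usual aggregation response shape (illustrative only).
sample_response = {
    "aggregations": {
        "msgs": {
            "doc_count": 3,
            "by_seq": {
                "buckets": [
                    {"key": 1, "doc_count": 2,
                     "by_ts": {"buckets": [{"key": 1000, "doc_count": 2}]}},
                    {"key": 2, "doc_count": 1,
                     "by_ts": {"buckets": [{"key": 2000, "doc_count": 1}]}},
                ]
            },
        }
    }
}

mismatches = unmatched_pairs(sample_response)
```

Inside a nested aggregation, doc_count counts individual messages rather than parent docs, so a pair present in both doc_a and doc_b shows up with a count of 2. The terms aggregation only returns the top `size` buckets, so large ranges would need paging or a larger size.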