I have a number of time series, each with a unique identifier and each indexed by a numeric field, sampleindex. Each entry in a time series is a sequence of words.
For each time series, I now need to count, for each sampleindex, the number of entries at an earlier sampleindex whose word sequence matches its own. A match requires two or more of the sixteen words to be the same.
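To make the requirement concrete, here is a brute-force reference in Python for a single series. It is only a sketch of what I want to compute, not something that would scale. It assumes a "match" means the two entries share at least two distinct words regardless of position; the function name and the (sampleindex, words) tuple layout are made up for illustration.

```python
def count_earlier_matches(samples, min_shared=2):
    """For one series, map each sampleindex to the number of earlier
    entries sharing at least `min_shared` distinct words with it.

    `samples` is a list of (sampleindex, words) pairs (assumed layout).
    Quadratic in the series length, so only useful as a reference.
    """
    # Process entries in sampleindex order so "earlier" is well defined.
    ordered = sorted(samples, key=lambda s: s[0])
    word_sets = [(idx, set(words)) for idx, words in ordered]

    counts = {}
    for i, (idx, words) in enumerate(word_sets):
        # Compare against every entry with a smaller sampleindex.
        counts[idx] = sum(
            1 for _, earlier in word_sets[:i]
            if len(words & earlier) >= min_shared
        )
    return counts


example = [
    (0, ["a", "b", "c"]),
    (1, ["a", "b", "x"]),
    (2, ["a", "y", "z"]),
    (3, ["b", "c", "a"]),
]
result = count_earlier_matches(example)
```

In this toy example, sample 3 matches samples 0 and 1 but not sample 2, which shares only one word with it.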
Each time series is over a million samples long. What is the best way to organise the data in Elasticsearch and to evaluate this count efficiently for each sample?
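For reference, the naive per-sample query I can think of looks like the sketch below, written as a Python function that builds the request body. It assumes one document per sample with hypothetical fields series_id, sampleindex and words (words mapped as a keyword array); none of these names are fixed yet. It uses a bool query with minimum_should_match, and the body could be sent to the count API. The obvious drawback is that it needs one query per sample, which is over a million queries per series.

```python
def build_count_query(series_id, sampleindex, words, min_match=2):
    """Build a count request body for one sample (field names are assumptions).

    Counts documents in the same series with a smaller sampleindex that
    contain at least `min_match` of the given words.
    """
    unique_words = sorted(set(words))  # avoid counting a repeated word twice
    return {
        "query": {
            "bool": {
                # Filters restrict the candidates without affecting scoring.
                "filter": [
                    {"term": {"series_id": series_id}},
                    {"range": {"sampleindex": {"lt": sampleindex}}},
                ],
                # One optional clause per word; at least `min_match` must hit.
                "should": [{"term": {"words": w}} for w in unique_words],
                "minimum_should_match": min_match,
            }
        }
    }


body = build_count_query("series-1", 1000, ["alpha", "beta", "alpha", "gamma"])
```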