Ngrams or dense vectors for similarity between arrays?

I have an index field "names" holding an array of place names. My goal is an ES query that takes an array of place names and returns the most similar docs, based on all the names in both the query array and the indexed arrays. If any name matches exactly, that doc should score highest. As I understand it, the ngram tokenizer produces a bag of tokens, which would not rank candidate matches properly in many cases, e.g. when two names are quite similar to names in an indexed doc but the 4 others in the array are not. So I think the key is somehow merging the individual name-to-name comparisons into one score. Any ideas on how to approach this are welcome. Are pre-computed dense vectors a possible approach?
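
To make "merging name-to-names comparisons" concrete, here is a minimal sketch in plain Python, run outside Elasticsearch (for example as a re-ranking step over candidate docs). It is one possible reading of the idea, not an ES feature: each query name is compared to every name in a doc using character-trigram Jaccard similarity, the best match per query name is kept, the best matches are averaged, and a fixed boost is added when any name matches exactly. The function names, the trigram size, and the `exact_boost` value are all assumptions chosen for illustration.

```python
def normalize(name):
    # Lowercase and collapse whitespace so "  New  York " equals "new york".
    return " ".join(name.lower().split())


def char_ngrams(name, n=3):
    # Pad with spaces so the start and end of the name produce their own ngrams.
    padded = f" {normalize(name)} "
    if len(padded) < n:
        return {padded}
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def name_similarity(a, b, n=3):
    # Jaccard overlap of the two names' character ngram sets, in [0, 1].
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def array_similarity(query_names, doc_names, exact_boost=1.0):
    # Assumption: score = mean of each query name's best match in the doc,
    # plus a fixed boost if at least one name matches exactly.
    if not query_names or not doc_names:
        return 0.0
    best = [max(name_similarity(q, d) for d in doc_names) for q in query_names]
    mean_best = sum(best) / len(best)
    doc_normalized = {normalize(d) for d in doc_names}
    has_exact = any(normalize(q) in doc_normalized for q in query_names)
    return mean_best + (exact_boost if has_exact else 0.0)


if __name__ == "__main__":
    query = ["London", "Londinium"]
    docs = {
        "doc_a": ["London", "Lundenwic"],  # contains an exact match
        "doc_b": ["Londen", "Londres"],    # similar names only
    }
    ranked = sorted(docs, key=lambda k: array_similarity(query, docs[k]), reverse=True)
    print(ranked)
```

Because the mean of best matches never exceeds 1.0, an `exact_boost` of 1.0 or more keeps any doc with an exact match at or above every doc without one. Averaging the best match per query name also means a few unrelated names lower the score gradually instead of being lost in a single bag of tokens.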
