Using cosineSimilarity function inside aggregation scripts

I am trying to use the cosineSimilarity function inside an aggregation script, but I am getting this error:

Unknown call [cosineSimilarity] with [2] arguments

{
    "query": {
        "match_all": {}
    },
    "aggs": {
        "group": {
            "terms": {
                "field": "group"
            },
            "aggs": {
                "profit": {
                    "scripted_metric": {
                        "params": {
                             "que_vec": [1,2,3], "ans_vec": [3,4,5]
                        },
                        "init_script": "state.scores = [:]",
                        "map_script": "double score = 0; if (doc.type.value == 'question') { score = cosineSimilarity(params.que_vec, doc.vector) } else { score = cosineSimilarity(params.ans_vec, doc.vector) } if (score > state.scores.getOrDefault(doc.type.value, 0)) { state.scores[doc.type.value] = score }",
                        "combine_script": "return state.scores",
                        "reduce_script": "return states"
                    }
                }
            }
        }
    }
}

@lonewolf The cosineSimilarity function is not available in this context; it is only available in the score context. Different Painless contexts expose different functions.
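For reference, here is a rough sketch of what calling the function in the score context can look like, written as a Python dict so the request body is easy to build and inspect. The field name `vector` comes from the request above; `query_vector` and the helper `build_script_score_query` are illustrative names only. The exact form of the second argument has differed between Elasticsearch versions (some releases take the field name as a string, earlier ones took `doc['vector']`), so treat this as an assumption to check against the docs for your version.

```python
import json


def build_script_score_query(field, query_vector):
    """Build a script_score request body (hypothetical helper, not a client API)."""
    return {
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    # Adding 1.0 shifts cosine similarity from [-1, 1] to [0, 2],
                    # because Elasticsearch does not accept negative scores.
                    # Assumption: this version accepts the field name as a string.
                    "source": f"cosineSimilarity(params.query_vector, '{field}') + 1.0",
                    "params": {"query_vector": query_vector},
                },
            }
        }
    }


body = build_script_score_query("vector", [1, 2, 3])
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a search request; the similarity then shows up as each hit's `_score`.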

Can you please describe your use case? Why do you need cosineSimilarity to calculate aggregation results?

I want to compare a given query (a set of vectors) against all documents (vectors) in the index. The documents will be aggregated into buckets, and for each bucket, pairwise cosine similarity will be computed between the vectors in the bucket and the vectors in the query. Finally, the highest cosine score and the corresponding pair should be returned.

If I do the cosine similarity calculation in a score script, the result is available as the "_score" field in the aggregation, which would let me find the highest cosine score for each bucket. But it would not tell me which pair of vectors produced that score.
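To make the intended result concrete, here is a minimal client-side sketch in plain Python of the per-bucket comparison described above. It is not an Elasticsearch feature: it assumes the documents have already been fetched, and that each one carries the `group`, `type` and `vector` fields used in the request. The function names and the shape of the returned dict are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_pair_per_group(docs, query_vectors):
    """For each group, keep the highest-scoring (query vector, document vector) pair.

    docs: iterable of dicts with 'group', 'type' and 'vector' keys (assumed shape).
    query_vectors: dict mapping a document type to the query vector used for it.
    """
    best = {}
    for doc in docs:
        query_vec = query_vectors[doc["type"]]
        score = cosine_similarity(query_vec, doc["vector"])
        current = best.get(doc["group"])
        if current is None or score > current["score"]:
            best[doc["group"]] = {
                "score": score,
                "type": doc["type"],
                "query_vector": query_vec,
                "doc_vector": doc["vector"],
            }
    return best


# Example call with the query vectors from the request and made-up documents.
sample_docs = [
    {"group": "a", "type": "question", "vector": [2, 4, 6]},
    {"group": "a", "type": "answer", "vector": [1, 0, 0]},
    {"group": "b", "type": "answer", "vector": [0, 1, 0]},
]
result = best_pair_per_group(sample_docs, {"question": [1, 2, 3], "answer": [3, 4, 5]})
```

Unlike a plain maximum over "_score", each entry here keeps both vectors of the winning pair next to the score, which is the piece that is missing from the score-script approach.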

