Query optimization

hadrienrsq · October 25, 2021, 9:43am

Dear community,

I am fairly new to Elasticsearch and would like to run a two-step query on a single index. The items in the index have an attribute we call cli_ID. I would like to pre-filter the data on that attribute, and then run an embedding similarity calculation (which can take some time) only on the items that match the first query.

I would also need to retrieve the scores produced by the embedding calculation only.

I have implemented a solution using the post_filter option of Elasticsearch. However, it does not return the score associated with the second query (the embedding similarity calculation).

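# cli_ID is a string holding the client identifier to match
query_body = {
    # Step 1: keep only documents whose client_ID contains cli_ID
    "query": {"wildcard": {"client_ID": "*" + cli_ID + "*"}},
    # Step 2: applied after the main query, so it does not affect _score
    "post_filter": {
        "bool": {
            "must": [
                # query for the embedding calculation goes here
            ]
        }
    }
}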

This is a test using the post_filter option, but it does not fully fit our use case.

This link indicates that the post-filter can't provide the score using a filter option.

My question is fairly open, but I would really appreciate guidance towards any solution that:

Helps me pre-filter the items in the index by client_ID
Then computes the embedding similarity on those items only
Lets me access the scores produced by the second query
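
One direction that seems to match these three points, as far as I can tell, is a script_score query: the client_ID condition goes in a bool filter (which restricts the documents without contributing to the score), and the script then computes the similarity only on the documents that passed the filter. The sketch below is only an illustration under assumptions: the field name "embedding", the helper names build_query and extract_scores, and the use of cosineSimilarity on a dense_vector field are mine, not something I have validated on our index. Depending on the Elasticsearch version, the field argument in the script may need to be written as doc['embedding'] instead of 'embedding'.

```python
def build_query(cli_id, query_vector, vector_field="embedding"):
    """Build a script_score query body (hypothetical helper).

    Assumes `vector_field` is mapped as a dense_vector and that
    `query_vector` has the same number of dimensions.
    """
    return {
        "query": {
            "script_score": {
                # Step 1: filter context only, so this part does not
                # contribute to the final score.
                "query": {
                    "bool": {
                        "filter": [
                            {"wildcard": {"client_ID": "*" + cli_id + "*"}}
                        ]
                    }
                },
                # Step 2: the script result becomes the document's _score.
                # +1.0 keeps the score non-negative, which Elasticsearch
                # requires.
                "script": {
                    "source": "cosineSimilarity(params.query_vector, '"
                    + vector_field
                    + "') + 1.0",
                    "params": {"query_vector": query_vector},
                },
            }
        }
    }


def extract_scores(response):
    """Return (document id, score) pairs from a search response dict."""
    return [(hit["_id"], hit["_score"]) for hit in response["hits"]["hits"]]


query_body = build_query("1234", [0.1, 0.2, 0.3])
```

With this shape, the _score of each hit would be the cosine similarity plus 1.0, so subtracting 1.0 should give back the raw similarity. The body would be sent with the usual search call of the Python client, and extract_scores would be applied to the response.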

Thank you in advance for your help.
