Hi there!
I noticed that support for sparse vectors is being deprecated due to lack of interest and use cases: https://github.com/elastic/elasticsearch/pull/48781.
This is a great article that discusses the trade-off between sparse and dense vectors as a balance of recall (how much should we pull back for a given query) and precision (how close a term match must be for it to be returned).
My use case is that I want to be perfectly precise with term queries. Given documents:
{
"fieldA": {"one": 3, "two": 10, "five": 70},
"title": "some document"
},
{
"fieldA": {"three": 8, "four": 9, "ten": 33},
"title": "some other document"
}
I want the term frequency of each term in fieldA to equal the number in the sparse vector.
Currently I'm repeating each term as many times as its count, but that doesn't scale once the numbers get large.
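For reference, here is a minimal Python sketch of the repetition workaround I'm describing. The helper name `expand_counts` is made up for illustration, and I'm assuming the resulting string would be indexed as an ordinary whitespace-analyzed text field; this is not code from my actual pipeline.

```python
def expand_counts(counts):
    """Repeat each term as many times as its count, separated by spaces.

    `counts` maps term -> integer count (the "sparse vector").
    The output length grows with the sum of all counts, which is
    exactly why this approach stops scaling for large numbers.
    """
    parts = []
    for term, n in counts.items():
        parts.extend([term] * n)  # one token per unit of count
    return " ".join(parts)


# Hypothetical usage with the first example document.
doc = {"fieldA": {"one": 3, "two": 10, "five": 70}, "title": "some document"}
text = expand_counts(doc["fieldA"])
tokens = text.split()
```

With the first document this already produces 83 tokens for three terms, so counts in the thousands or millions would bloat the indexed field badly.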
I'm also not quite sure how I'd query this, since queries against a sparse vector field are themselves sparse vectors. Is it possible to query a sparse vector field with a dense vector query?
Is this possible some other way?