Please don't deprecate sparse vector fields

George_Kinsman · February 12, 2020, 6:50pm

Hi there!

I noticed that support for sparse vectors is being deprecated due to lack of interest and use cases: https://github.com/elastic/elasticsearch/pull/48781.

This is a great article that discusses the trade-off between sparse and dense vectors as a balance of recall (how much should be pulled back for a given query) and precision (how closely a term must match to be returned).

My use case is that I want to be perfectly precise with term queries. Given documents:

    {
      "fieldA": {"one": 3, "two": 10, "five": 70},
      "title": "some document"
    },
    {
      "fieldA": {"three": 8, "four": 9, "ten": 33},
      "title": "some other document"
    }

I want the term frequency of each term in fieldA to equal the number given for it in the sparse vector.
Currently I'm repeating each term as many times as its count, but that doesn't scale once the numbers get large.
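
For readers following along, the repetition workaround described above might look roughly like the sketch below. The helper name `expand_counts` is hypothetical and not part of any Elasticsearch API; it only illustrates why the indexed text grows linearly with the counts.

```python
def expand_counts(counts):
    """Repeat each term as many times as its count and join with spaces.

    Hypothetical helper: the resulting string would be indexed as a text
    field so that each term's frequency equals its count.
    """
    tokens = []
    for term, n in counts.items():
        tokens.extend([term] * n)  # n copies of the term
    return " ".join(tokens)


# The first example document's fieldA expands to 3 + 10 + 70 = 83 tokens.
expanded = expand_counts({"one": 3, "two": 10, "five": 70})
print(len(expanded.split()))
```

With counts in the thousands or millions, the expanded string becomes correspondingly large, which is the scalability concern raised here.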

I'm not quite sure how I'd query this since sparse vector queries are sparse vectors... is it possible to query a sparse vector field with a dense vector query?

Is this possible some other way?

mayya · February 13, 2020, 3:29pm

Hello,
Indeed, sparse vectors have been removed from Elasticsearch due to the lack of solid use cases. However, if we see interesting use cases that sparse vectors can address, we will consider reintroducing them. That said, the article you referenced seems to talk about a different type of sparse vector: the way traditional Lucene indexes are organized. Postings lists in a Lucene index do indeed represent sparse vectors.

About your use case, can you please give a more detailed example of the type of query you want to run, along with examples of matching documents?
It is possible that the rank_features datatype and the rank_feature query can address your use case.
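
As a rough illustration of the rank_features suggestion, the sketch below builds the request bodies as plain Python dictionaries. The field names come from the example documents earlier in the thread. The helper `rank_feature_query` and the choice to combine the clauses in a `bool`/`should` are assumptions made for illustration. Note that a rank_feature query scores with a function of the stored value (saturation by default), so the score is not the raw count itself.

```python
import json

# Assumed mapping: "fieldA" stores one numeric weight per term.
mapping = {
    "mappings": {
        "properties": {
            "fieldA": {"type": "rank_features"},
            "title": {"type": "text"},
        }
    }
}

# A document keeps the counts as-is; no term repetition is needed.
doc = {"fieldA": {"one": 3, "two": 10, "five": 70}, "title": "some document"}


def rank_feature_query(field, terms):
    """Hypothetical helper: one rank_feature clause per queried term."""
    should = [{"rank_feature": {"field": f"{field}.{term}"}} for term in terms]
    return {"query": {"bool": {"should": should}}}


query = rank_feature_query("fieldA", ["one", "five"])
print(json.dumps(query, indent=2))
```

These bodies would be sent to the index creation, document indexing, and search endpoints respectively; whether the resulting scoring is close enough to true term frequencies depends on the use case.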

system · March 12, 2020, 3:41pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
