Relevance Issues in Vector Search

Hello,

I am testing self-hosted solutions and have run into an issue with vector search. My goal is to index the pages of a website, each of which consists mainly of a title and content. I have created two dense_vector fields, one for the title and one for the content:

["mappings"]["properties"]["title_vector"] = [
    "type" => "dense_vector",
    "dims" => 512,
    "index" => true,
];
["mappings"]["properties"]["content_vector"] = [
    "type" => "dense_vector",
    "dims" => 512,
    "index" => true,
];

I am using the "FlaubertModel" transformer to convert the title and the content into vectors. However, when I search with kNN, the results are inconsistent.
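For context, a transformer such as FlaubertModel returns one vector per token, so some pooling step is needed to get a single 512-dimensional vector per title or page. The post does not say which pooling is used. The sketch below assumes masked mean pooling followed by L2 normalization, a common choice; the helper names `mean_pool` and `l2_normalize` are hypothetical, and the toy vectors stand in for real model output.

```python
import math


def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask is 1, ignoring padding tokens."""
    dims = len(token_vectors[0])
    total = [0.0] * dims
    count = 0
    for vector, keep in zip(token_vectors, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    if count == 0:
        raise ValueError("attention mask selects no tokens")
    return [value / count for value in total]


def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vector))
    if norm == 0.0:
        return list(vector)
    return [v / norm for v in vector]


# Toy stand-in for model output: three tokens of dimension 4, the last is padding.
tokens = [
    [1.0, 0.0, 2.0, 0.0],
    [3.0, 0.0, 0.0, 0.0],
    [9.0, 9.0, 9.0, 9.0],
]
mask = [1, 1, 0]

page_vector = l2_normalize(mean_pool(tokens, mask))
```

Whatever pooling is chosen, the same one has to be applied to the indexed text and to the search query, otherwise the two vectors are not comparable.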

I'm not sure whether this is due to an incorrect number of dimensions, a model that is not suited to producing these vectors, or a flawed strategy. For example, it may not be appropriate to convert the entire content of a page into a single vector; embedding a summary of the page might give more consistent results.
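One alternative to embedding a whole page or a summary is to split the content into overlapping passages and embed each passage separately. This is only a sketch of that idea, not something the post has tried: the function name `chunk_words` and the window sizes are assumptions, and word counts are a rough proxy for the model's token limit.

```python
def chunk_words(text, max_words=200, overlap=50):
    """Split text into windows of at most max_words words that overlap by `overlap` words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if overlap < 0 or overlap >= max_words:
        raise ValueError("overlap must be >= 0 and smaller than max_words")

    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once a window has reached the end of the text.
        if start + max_words >= len(words):
            break
    return chunks


# Small example: 10 words, windows of 4 words sharing 1 word with the previous window.
sample = " ".join(f"w{i}" for i in range(10))
passages = chunk_words(sample, max_words=4, overlap=1)
```

Each passage would then get its own vector, so a query can match the relevant part of a long page instead of an average over everything on it.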

Example of search:

[
    "field" => "title_vector",
    "query_vector" => $searchVector,
    "k" => 5,
    "num_candidates" => 10,
    "boost" => 0.8
],
[
    "field" => "content_vector",
    "query_vector" => $searchVector,
    "k" => 20,
    "num_candidates" => 100,
    "similarity" => 20,
    "boost" => 0.5
]
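To make the two clauses easier to reason about, here is a rough sketch of how their scores would combine. It assumes both fields use cosine similarity, which the mapping above does not state explicitly. Under that assumption, the Elasticsearch documentation describes the kNN `_score` as (1 + cosine) / 2, the final score as the sum of each clause's boosted score, and the "similarity" parameter as a threshold on the raw similarity rather than on `_score`. The function names are hypothetical.

```python
import math


def cosine(a, b):
    """Raw cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn_score(cos):
    """Map a raw cosine in [-1, 1] to a score in [0, 1] (assumed cosine metric)."""
    return (1.0 + cos) / 2.0


def combined_score(title_cos, content_cos, title_boost=0.8, content_boost=0.5):
    """Sum the boosted scores; None means the document missed that clause's top k."""
    score = 0.0
    if title_cos is not None:
        score += title_boost * knn_score(title_cos)
    if content_cos is not None:
        score += content_boost * knn_score(content_cos)
    return score


def passes_threshold(cos, similarity):
    """Under the cosine assumption, a candidate is kept only if cos >= similarity."""
    return cos >= similarity


# A perfect title match with an unrelated body, using the boosts from the query above.
example = combined_score(title_cos=1.0, content_cos=0.0)
```

If the cosine assumption holds, a threshold of 20 can never be met, because cosine similarity cannot exceed 1, so the content clause would contribute nothing. With an l2_norm metric the same value would instead be a maximum distance. A "num_candidates" of 10 for "k" of 5 also leaves the approximate search very little room per shard.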

Thank you.