Hello,
I am testing self-hosted solutions and have run into an issue with vector search. My goal is to index the pages of a website, each of which consists mainly of a title and content. I have created two dense_vector fields, one for the title and one for the content:
["mappings"]["properties"]["title_vector"] = [
    "type" => "dense_vector",
    "dims" => 512,
    "index" => true,
];
["mappings"]["properties"]["content_vector"] = [
    "type" => "dense_vector",
    "dims" => 512,
    "index" => true,
];
I am using the "FlaubertModel" transformer to convert the title and content into vectors. However, when I search with kNN, I get inconsistent results.
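One thing worth checking here: a transformer encoder such as FlaubertModel outputs one hidden-state vector per token, so a pooling step is needed to get a single 512-dimensional vector per text. The sketch below shows mask-aware mean pooling followed by L2 normalization in plain Python. It assumes the per-token vectors and the attention mask have already been extracted from the model output as plain lists; the function names `mean_pool` and `l2_normalize` are illustrative and not part of any library.

```python
import math


def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dims = len(token_vectors[0])
    total = [0.0] * dims
    count = 0
    for vec, keep in zip(token_vectors, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    if count == 0:
        raise ValueError("attention mask selects no tokens")
    return [t / count for t in total]


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0:
        return list(vec)
    return [v / norm for v in vec]


# Toy example with 2-dimensional "token vectors"; the last token is padding.
pooled = mean_pool([[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]], [1, 1, 0])
unit = l2_normalize(pooled)
```

If the title, the content, and the search query are not all pooled the same way (for example, first-token vector for one and mean for another), the resulting vectors are not directly comparable, which could be one source of inconsistent rankings.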
I'm not sure whether this is caused by a wrong number of dimensions, by a model that is not suited to producing these vectors, or by an unsound strategy. For example, it may not be appropriate to convert the entire content of a page into a single vector; converting a summary of the page might give more consistent results.
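An alternative to summarizing is to split each page into overlapping passages and embed every passage separately, since transformer encoders only accept a limited number of tokens per input and a single vector for a long page tends to blur its topics. The following is a minimal sketch of such a splitter. The name `chunk_words` and the default sizes are assumptions chosen for illustration, and words are used as a rough stand-in for model tokens.

```python
def chunk_words(text, max_words=200, overlap=50):
    """Split text into windows of at most max_words words.

    Consecutive windows share `overlap` words so that a sentence cut at a
    boundary still appears whole in one of the chunks.
    """
    if max_words <= 0 or overlap < 0 or overlap >= max_words:
        raise ValueError("need 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


# Toy example: 10 words, windows of 4 words sharing 1 word.
sample = " ".join(f"w{i}" for i in range(10))
passages = chunk_words(sample, max_words=4, overlap=1)
```

Each passage would then be indexed as its own document (or nested entry) carrying the page identifier, so that a kNN hit on a passage can be mapped back to its page.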
Example of search:
[
    "field" => "title_vector",
    "query_vector" => $searchVector,
    "k" => 5,
    "num_candidates" => 10,
    "boost" => 0.8
],
[
    "field" => "content_vector",
    "query_vector" => $searchVector,
    "k" => 20,
    "num_candidates" => 100,
    "similarity" => 20,
    "boost" => 0.5
]
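For comparison, here is a sketch of the same two-clause request built as a Python dictionary, with a small sanity check on the parameters. It assumes an Elasticsearch version whose search API accepts a list under the top-level `knn` key. The helper names `knn_clause` and `build_search_body` are hypothetical, and the values used (a larger `num_candidates` for the title clause, no `similarity` threshold) are illustrative choices, not recommendations taken from the post. Two points may be worth verifying against the clauses above: `num_candidates` is the number of candidates examined per shard, so a value as low as 10 can make the approximate search miss good matches; and `similarity` is a minimum threshold on the raw similarity, so if the field uses cosine similarity, whose values lie between -1 and 1, a threshold of 20 could never be met.

```python
def knn_clause(field, query_vector, k, num_candidates, boost=1.0, similarity=None):
    """Build one kNN clause and check basic parameter constraints."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if num_candidates < k:
        raise ValueError("num_candidates must be at least k")
    clause = {
        "field": field,
        "query_vector": list(query_vector),
        "k": k,
        "num_candidates": num_candidates,
        "boost": boost,
    }
    # Only send a similarity threshold when one was explicitly requested.
    if similarity is not None:
        clause["similarity"] = similarity
    return clause


def build_search_body(query_vector, size=20):
    """Combine a title clause and a content clause into one search body."""
    return {
        "knn": [
            knn_clause("title_vector", query_vector, k=5, num_candidates=50, boost=0.8),
            knn_clause("content_vector", query_vector, k=20, num_candidates=100, boost=0.5),
        ],
        "size": size,
    }


# Toy 3-dimensional vector for illustration; the real one would have 512 values.
body = build_search_body([0.1, 0.2, 0.3])
```

The query vector must be produced by the same model and the same pooling as the indexed vectors, and it must have the same number of dimensions as the `dims` value in the mapping.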
Thank you.