How to build a bulk request to post multiple values on a dense_vector field

I have a document describing a video that contains multiple frames (images).
The use case is to find the videos containing frames similar to a given image.
An image embedding is computed for each frame and stored in a dense_vector field named Frame_vector.
To index this with a bulk request, I have to store multiple values in a single dense_vector field.

To do this, logically, I should post an array of arrays (a JSONArray of JSONArrays, i.e. a jagged array) to the dense_vector field Frame_vector.

Here is my Bulk request:

2024/04/24 16:46:32 347-qesjsp-null-DEBUG-QESHost_elastic81-bulkProcess()	: bulkJsonRequest::{"index":{"_index":"mediasearch","_id":"uk/Віталіи_Кім_про_виклики_у_перші_дні_вторгнення_Миколаівськии_Ваньок_та_мотивацію_Хоробрі_серця.xml"}}
"Frame":["uk/Віталіи_Кім_про_виклики_у_перші_дні_вторгнення_Миколаівськии_Ваньок_та_мотивацію_Хоробрі_серця_Frame_0.jpg",
"uk/Віталіи_Кім_про_виклики_у_перші_дні_вторгнення_Миколаівськии_Ваньок_та_мотивацію_Хоробрі_серця_Frame_1.jpg",
"uk/Віталіи_Кім_про_виклики_у_перші_дні_вторгнення_Миколаівськии_Ваньок_та_мотивацію_Хоробрі_серця_Frame_2.jpg",
"uk/Віталіи_Кім_про_виклики_у_перші_дні_вторгнення_Миколаівськии_Ваньок_та_мотивацію_Хоробрі_серця_Frame_3.jpg",
"uk/Віталіи_Кім_про_виклики_у_перші_дні_вторгнення_Миколаівськии_Ваньок_та_мотивацію_Хоробрі_серця_Frame_4.jpg",
"uk/Віталіи_Кім_про_виклики_у_перші_дні_вторгнення_Миколаівськии_Ваньок_та_мотивацію_Хоробрі_серця_Frame_5.jpg",
"uk/Віталіи_Кім_про_виклики_у_перші_дні_вторгнення_Миколаівськии_Ваньок_та_мотивацію_Хоробрі_серця_Frame_6.jpg",
"uk/Віталіи_Кім_про_виклики_у_перші_дні_вторгнення_Миколаівськии_Ваньок_та_мотивацію_Хоробрі_серця_Frame_7.jpg",
"uk/Віталіи_Кім_про_виклики_у_перші_дні_вторгнення_Миколаівськии_Ваньок_та_мотивацію_Хоробрі_серця_Frame_8.jpg",
"uk/Віталіи_Кім_про_виклики_у_перші_дні_вторгнення_Миколаівськии_Ваньок_та_мотивацію_Хоробрі_серця_Frame_9.jpg"],
"Frame_desc":["a woman in a black shirt and a white tank top",
"a man is looking at a computer screen",
"a man in a blue shirt is smiling",
"a man standing next to a man with a microphone",
"a man in a blue shirt holding a blue and white object",
"a man in a suit and tie standing in front of a microphone",
"a man standing next to a giant clock",
"a man holding a game controller in his hand",
"a man standing next to a large fish in a mirror",
"two men in military uniforms sitting on a couch"],
"Frame_vector":[
[-0.3481926918029785,-0.13305401802062988,.........,0.98209947645664215,-0.9743254542350769],
[-0.4481926918029785,-0.23305401802062988,.........,0.88209947645664215,-0.3743254542350769],
[-0.5481926918029785,-0.33305401802062988,.........,0.78209947645664215,-0.3743254542350769],
[-0.6481926918029785,-0.43305401802062988,.........,0.68209947645664215,-0.5743254542350769],
[-0.7481926918029785,-0.53305401802062988,.........,0.58209947645664215,-0.3743254542350769],
[-0.8481926918029785,-0.63305401802062988,.........,0.48209947645664215,-0.0743254542350769],
[-0.9481926918029785,-0.63305401802062988,.........,0.38209947645664215,-0.3743254542350769],
[-0.0481926918029785,-0.73305401802062988,.........,0.28209947645664215,-0.3743254542350769],
[-0.1481926918029785,-0.73305401802062988,.........,0.18209947645664215,-0.6743254542350769],
[-0.2481926918029785,-0.03305401802062988,.........,0.08209947645664215,-0.3743254542350769]
]
}

Here is what I get as a response:

2024/04/24 16:46:32 872-qesjsp-null-ERROR-ElasticApiService-parseBulkResponse()	: Erreur dans le Bulk:{"took":64,"items":[{"index":{"_index":"mediasearch","_id":"uk/Віталіи_Кім_про_виклики_у_перші_дні_вторгнення_Миколаівськии_Ваньок_та_мотивацію_Хоробрі_серця.xml","error":{"reason":"[1:11814] failed to parse: Failed to parse object: expecting token of type [VALUE_NUMBER] but found [START_ARRAY]","caused_by":{"reason":"Failed to parse object: expecting token of type [VALUE_NUMBER] but found [START_ARRAY]","col":11814,"line":1,"type":"parsing_exception"},"type":"document_parsing_exception"},"status":400}}],"errors":true}

Is this even possible?
Am I wrong in assuming that an array of arrays is the right format?

Thank you for your help

You would need to use nested fields, IMO, if that is supported; I have never tested it.

What is the mapping?

Otherwise, you should denormalize and index the frames one by one instead, which would probably be my preference.
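To illustrate the denormalized option, here is a minimal sketch in Python that builds a bulk body with one document per frame, so each dense_vector field receives a single flat array of numbers. The function name `build_frame_bulk`, the `video_id` field, and the `<video id>#<frame number>` `_id` scheme are assumptions made for illustration; they are not taken from the original request.

```python
import json


def build_frame_bulk(index, video_id, frames, descriptions, vectors):
    """Return an NDJSON bulk body with one document per frame."""
    if not (len(frames) == len(descriptions) == len(vectors)):
        raise ValueError("frames, descriptions and vectors must have the same length")

    lines = []
    for i, (frame, desc, vector) in enumerate(zip(frames, descriptions, vectors)):
        # Action line. The _id scheme "<video id>#<frame number>" is hypothetical.
        action = {"index": {"_index": index, "_id": f"{video_id}#{i}"}}
        # Source line. Frame_vector holds exactly one flat array of numbers.
        source = {
            "video_id": video_id,  # hypothetical field linking the frame to its video
            "Frame": frame,
            "Frame_desc": desc,
            "Frame_vector": vector,
        }
        lines.append(json.dumps(action, ensure_ascii=False))
        lines.append(json.dumps(source, ensure_ascii=False))

    # The _bulk API expects newline-delimited JSON ending with a newline.
    return "\n".join(lines) + "\n"
```

Each hit of a similarity search then carries `video_id`, which points back to the video the matching frame belongs to.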

How do you plan on searching for it? Similarity searching for a particular frame in the video?

As @dadoonet mentioned, a nested dense vector query would be a good candidate.

Docs
Search Labs article with mappings for dense_vector
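As a rough sketch of the nested approach, the Python snippet below builds a mapping, a document body, and a kNN search body in which every frame is a nested object holding one vector. The nested field name `frames`, the `DIMS` placeholder, the `cosine` similarity, and the helper names are assumptions, not details from this thread; kNN search on nested dense_vector fields also appears to need a recent 8.x release, so check the documentation for your version.

```python
import json

DIMS = 3  # placeholder; must equal the real embedding size

# Assumed mapping: each element of "frames" is a nested object with one vector.
mapping = {
    "mappings": {
        "properties": {
            "frames": {
                "type": "nested",
                "properties": {
                    "Frame": {"type": "keyword"},
                    "Frame_desc": {"type": "text"},
                    "Frame_vector": {
                        "type": "dense_vector",
                        "dims": DIMS,
                        "index": True,
                        "similarity": "cosine",
                    },
                },
            }
        }
    }
}


def build_nested_doc(frames, descriptions, vectors):
    """Return a document source with one nested object per frame."""
    if not (len(frames) == len(descriptions) == len(vectors)):
        raise ValueError("frames, descriptions and vectors must have the same length")
    return {
        "frames": [
            {"Frame": f, "Frame_desc": d, "Frame_vector": v}
            for f, d, v in zip(frames, descriptions, vectors)
        ]
    }


def build_knn_search(query_vector, k=10, num_candidates=100):
    """Return a search body with a top-level knn clause on the nested vector field."""
    return {
        "knn": {
            "field": "frames.Frame_vector",
            "query_vector": query_vector,
            "k": k,
            "num_candidates": num_candidates,
        }
    }


doc = build_nested_doc(["f0.jpg", "f1.jpg"], ["a", "b"], [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
print(json.dumps(doc, ensure_ascii=False))
```

With this layout the video stays a single document, and the search returns the videos whose nested frames are closest to the query vector.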

Joe

A response from dadoonet the great: I feel very flattered :slight_smile: ...
... and it is fully accurate, as ever.

Nested fields seem to be the definitive solution.

Since they have been flagged as bad for performance, I never looked into them, but here I have no choice.

With that in mind, I searched again and found this previous topic:

(How to search Multiple dense vector fields, including nested filelds)

Everything is in it. Mine is a poor duplicate.

Thank you both for your valuable help.

Please close.
