Most efficient way of accessing long[][] data in Painless scripts

Pyppe · June 20, 2023, 6:48am

Hi!

We have a custom solution for calculating similarities using feature vectors. Each document in our Elasticsearch index can have multiple entities, so each document effectively carries long[][] data that we would like to use for calculating distances. For example:

[
  [378322298287171600,-9182346388506132000,-7884923301547995000,2954398850619687400,5792760765226170000,6941191355558596000,-9175934689997701000,2453767474651472000],
  [2395942447151390700,7206045950792974000,-6761273774897486000,648553841033347700,-4591414079501816000,3563632123683616000,288379928265751740,733693665263878500],
  ...
]

How should we define the mapping so that a Painless script can access this data efficiently, ideally along these lines:

long[][] vectors = doc['vectors'].value;

Thanks for all the tips!

Pyppe · June 20, 2023, 1:06pm

FYI: I found "Access fields in a document with the field API" (Elasticsearch Guide [8.8]), which talks about accessing the binary format, so I tried that with this mapping:

"viewVectors" : {
  "properties" : {
    "id_01_00" : {
      "type" : "binary",
      "store" : true,
      "doc_values" : true
    }
  }
}

With that I seem to be able to get access to a BytesRef. However, I'd like to use Java's ByteArrayInputStream and ObjectInputStream to deserialize it into a List<List<Long>>, and apparently the stream classes from java.io.* are not available in Painless.
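
One possible workaround, as a rough sketch rather than a verified solution: skip Java serialization altogether and pack the vectors as raw big-endian longs on the indexing side, so that decoding needs only loops, shifts and masks instead of java.io streams. The class name PackedVectors, the fixed dimension of 8 (taken from the sample data above) and the idea that the script can read the field's bytes one by one are all assumptions here. The code is plain Java, and the decode loop has not been tested inside a Painless script.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper, not part of Elasticsearch or Painless.
public class PackedVectors {
    // Assumed fixed vector length, based on the 8-element sample rows.
    static final int DIM = 8;

    // Indexing side: pack every long as 8 big-endian bytes, row after row.
    static byte[] encode(long[][] vectors) {
        ByteBuffer buf = ByteBuffer.allocate(vectors.length * DIM * Long.BYTES);
        for (long[] row : vectors) {
            if (row.length != DIM) {
                throw new IllegalArgumentException("expected " + DIM + " values per vector");
            }
            for (long value : row) {
                buf.putLong(value); // ByteBuffer is big-endian by default
            }
        }
        return buf.array();
    }

    // Script side: rebuild long[][] using only shifts and masks.
    static long[][] decode(byte[] bytes) {
        int rowBytes = DIM * Long.BYTES;
        if (bytes.length % rowBytes != 0) {
            throw new IllegalArgumentException("length is not a multiple of " + rowBytes);
        }
        long[][] out = new long[bytes.length / rowBytes][DIM];
        for (int i = 0; i < bytes.length; i += Long.BYTES) {
            long value = 0L;
            for (int b = 0; b < Long.BYTES; b++) {
                // Mask to avoid sign extension of negative bytes.
                value = (value << 8) | (bytes[i + b] & 0xFFL);
            }
            int index = i / Long.BYTES;
            out[index / DIM][index % DIM] = value;
        }
        return out;
    }

    public static void main(String[] args) {
        long[][] vectors = {
            {1L, -1L, Long.MIN_VALUE, Long.MAX_VALUE, 0L, 42L, -42L, 7L},
            {8L, 7L, 6L, 5L, 4L, 3L, 2L, 1L}
        };
        byte[] packed = encode(vectors);
        // A binary field takes its value as a Base64-encoded string.
        String fieldValue = Base64.getEncoder().encodeToString(packed);
        System.out.println(fieldValue.length() + " Base64 characters");
        System.out.println(Arrays.deepEquals(vectors, decode(packed))); // prints true
    }
}
```

With this layout each vector occupies exactly 64 bytes, so the number of vectors in a document is simply the byte length divided by 64, and a distance calculation could read the longs directly without building intermediate lists.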

system · July 18, 2023, 1:07pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
