Most efficient way of accessing long[][] data in Painless scripts


We have a custom solution for calculating similarities using feature vectors. Each document in the Elasticsearch index can have multiple entities, so each document effectively carries long[][]-shaped data that we would like to use to calculate distances. Such as this:


How should we define the mapping so that this data can be accessed efficiently in a Painless script, for example:

long[][] vectors = doc['vectors'].value;

Thanks for all the tips! :pray:

FYI: I found the "Access fields in a document with the field API" page in the Elasticsearch Guide [8.8], which talks about accessing the binary format, so I tried that with this mapping:

"viewVectors" : {
  "properties" : {
    "id_01_00" : {
      "type" : "binary",
      "store" : true,
      "doc_values" : true
    }
  }
}

With that I seem to be able to get access to a BytesRef. However, I'd like to use Java's ByteArrayInputStream and ObjectInputStream to transform it into a List<List<Long>>, but apparently the stream classes from java.io.* are not available in Painless. :pensive:
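
One possible workaround, sketched here in plain Java rather than Painless: skip Java serialization entirely and store the vectors in a simple fixed layout that can be decoded with nothing but array indexing and bit shifts. The class name VectorCodec, its method names, and the layout (4-byte row count, 4-byte dimension, then big-endian longs) are all made up for illustration, and it assumes every vector in a document has the same length. The encode side would run in the indexing client, with the resulting bytes Base64-encoded for the binary field. The decode side avoids stream classes, so it should be portable to a Painless script, assuming the BytesRef exposes its bytes, offset and length there; I have not verified that part.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Elasticsearch or Painless.
final class VectorCodec {
    private VectorCodec() {}

    // Client side: pack a rectangular long[][] as
    // [rows:int][dim:int][rows*dim longs], all big-endian (ByteBuffer's default).
    static byte[] encode(long[][] vectors) {
        int rows = vectors.length;
        int dim = rows == 0 ? 0 : vectors[0].length;
        ByteBuffer buf = ByteBuffer.allocate(8 + rows * dim * Long.BYTES);
        buf.putInt(rows).putInt(dim);
        for (long[] row : vectors) {
            if (row.length != dim) {
                throw new IllegalArgumentException("all vectors must have the same length");
            }
            for (long v : row) {
                buf.putLong(v);
            }
        }
        return buf.array();
    }

    // Script side: decode using only array reads and shifts.
    // offset/length mirror the fields of a Lucene BytesRef.
    static long[][] decode(byte[] bytes, int offset, int length) {
        if (length < 8) {
            throw new IllegalArgumentException("payload too short for header");
        }
        int rows = readInt(bytes, offset);
        int dim = readInt(bytes, offset + 4);
        if (length != 8 + rows * dim * 8) {
            throw new IllegalArgumentException("payload size does not match header");
        }
        long[][] out = new long[rows][dim];
        int pos = offset + 8;
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < dim; c++) {
                out[r][c] = readLong(bytes, pos);
                pos += 8;
            }
        }
        return out;
    }

    // Read a big-endian 32-bit int starting at position p.
    static int readInt(byte[] b, int p) {
        return ((b[p] & 0xFF) << 24) | ((b[p + 1] & 0xFF) << 16)
             | ((b[p + 2] & 0xFF) << 8) | (b[p + 3] & 0xFF);
    }

    // Read a big-endian 64-bit long starting at position p.
    static long readLong(byte[] b, int p) {
        long v = 0L;
        for (int i = 0; i < 8; i++) {
            v = (v << 8) | (b[p + i] & 0xFF);
        }
        return v;
    }
}
```

Calling VectorCodec.decode(VectorCodec.encode(vectors), 0, encoded.length) should give back the original array. If the distance calculation only needs one pass over the values, readLong could be called directly inside the loop instead of materializing a long[][] for every document.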
