Aggregating multiple values from single fields

Hi all,

I have some documents, where a given field can have one or several values (strings or numerical, not mixed though) and I wanted to know if it's possible to operate these, like count the number of values of one type

It looks something like this when I go to discovery:

Thanks in advance for any info and/or help!

Hi,

I suppose it is difficult to do that in usual query.
If performance is ok, using runtime field is a good try.

Or you may use ingest pipeline to execute the calculation while ingesting.

Hi Tomo,

Thanks for the reply.
I am stuck in the script, when I try to get the list.
I've tried .size() or .length to check the number of elements, but I always get 1 instead of the correct value.

This is what i've in general done so far.

int n_instances = 0;
def pone = +1;
def dsource = doc["json.meta.occurrences_fp.source_target_type.keyword"];
for(cur_source in dsource){
    if(cur_source == 'FUSD'){
    n_instances = pone; 
    }
}
emit(n_instances)

I've seen that ._source could also be used, but haven't managed either ...

Is this what you meant or what should I better go for?
(I've read that painless is not good for accessing nested documents, is this correct, should I better, reformat my documents ?)

As you tried, there are .length accessor for array.

.length worked fine for me. What is the difference??

PUT test_runtime_field
{
  "mappings": {
    "properties": {
      "original":{
        "type": "text",
        "fields":{
          "keyword":{
            "type": "keyword"
          }
        }
      }
    }
  }
}

POST test_runtime_field/_doc
{
  "original": ["aaa", "bb", "c"]
}

POST test_runtime_field/_doc
{
  "original": "aaa"
}

PUT test_runtime_field/_mapping
{
  "runtime":{
    "original_length":{
      "type": "long",
      "script":{
        "source": "emit(doc['original.keyword'].length)"
      }
    }
  }
}

GET test_runtime_field/_search
{
  "fields":["original_length"]
}

If you are using text field and/or turn off doc_values, you have to use _source field and check if it is array or not.

DELETE test_runtime_field

PUT test_runtime_field
{
  "mappings": {
    "properties": {
      "original":{
        "type": "text"
      }
    }
  }
}

POST test_runtime_field/_doc
{
  "original": ["aaa", "bb", "c"]
}

POST test_runtime_field/_doc
{
  "original": "aaa"
}

PUT test_runtime_field/_mapping
{
  "runtime":{
    "original_length":{
      "type": "long",
      "script":{
        "source": """
          if (params._source['original'] instanceof ArrayList){
            emit(params._source['original'].length)
          } else {emit(1)} """
      }
    }
  }
}

GET test_runtime_field/_search
{
  "fields":["original_length"]
}

I don't know why it doesn't work,
I tried with a dummy array of (apparently) the same characteristics and it worked, but with the real one, couldn't manage.
Anyway, I had to change the format of the documents, since I also needed to tabulate some of those results and with this structure it was getting over complicated for the display...
Thanks anyway for the support!