Aggregating multiple values from single fields

Hi all,

I have some documents, where a given field can have one or several values (strings or numerical, not mixed though) and I wanted to know if it's possible to operate these, like count the number of values of one type

It looks something like this when I go to discovery:

Thanks in advance for any info and/or help!

Hi,

I suppose it is difficult to do that in usual query.
If performance is ok, using runtime field is a good try.

Or you may use ingest pipeline to execute the calculation while ingesting.

Hi Tomo,

Thanks for the reply.
I am stuck in the script, when I try to get the list.
I've tried .size() or .length to check the number of elements, but I always get 1 instead of the correct value.

This is what i've in general done so far.

int n_instances = 0;
def pone = +1;
def dsource = doc["json.meta.occurrences_fp.source_target_type.keyword"];
for(cur_source in dsource){
    if(cur_source == 'FUSD'){
    n_instances = pone; 
    }
}
emit(n_instances)

I've seen that ._source could also be used, but haven't managed either ...

Is this what you meant or what should I better go for?
(I've read that painless is not good for accessing nested documents, is this correct, should I better, reformat my documents ?)

As you tried, there are .length accessor for array.

.length worked fine for me. What is the difference??

PUT test_runtime_field
{
  "mappings": {
    "properties": {
      "original":{
        "type": "text",
        "fields":{
          "keyword":{
            "type": "keyword"
          }
        }
      }
    }
  }
}

POST test_runtime_field/_doc
{
  "original": ["aaa", "bb", "c"]
}

POST test_runtime_field/_doc
{
  "original": "aaa"
}

PUT test_runtime_field/_mapping
{
  "runtime":{
    "original_length":{
      "type": "long",
      "script":{
        "source": "emit(doc['original.keyword'].length)"
      }
    }
  }
}

GET test_runtime_field/_search
{
  "fields":["original_length"]
}

If you are using text field and/or turn off doc_values, you have to use _source field and check if it is array or not.

DELETE test_runtime_field

PUT test_runtime_field
{
  "mappings": {
    "properties": {
      "original":{
        "type": "text"
      }
    }
  }
}

POST test_runtime_field/_doc
{
  "original": ["aaa", "bb", "c"]
}

POST test_runtime_field/_doc
{
  "original": "aaa"
}

PUT test_runtime_field/_mapping
{
  "runtime":{
    "original_length":{
      "type": "long",
      "script":{
        "source": """
          if (params._source['original'] instanceof ArrayList){
            emit(params._source['original'].length)
          } else {emit(1)} """
      }
    }
  }
}

GET test_runtime_field/_search
{
  "fields":["original_length"]
}

I don't know why it doesn't work,
I tried with a dummy array of (apparently) the same characteristics and it worked, but with the real one, couldn't manage.
Anyway, I had to change the format of the documents, since I also needed to tabulate some of those results and with this structure it was getting over complicated for the display...
Thanks anyway for the support!

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.