Aggregation by the first character of a field

Dears,

Is there any way to aggregate documents on the first character of a field instead of the whole field?

My query for the whole field:

GET /log-2020.07.07/_search?size=0
{
  "aggs": {
    "TEXT1": {
      "terms": {
        "field": "ci.rc.keyword",
        "size": 10
      }
    }
  }
}

and the result looks like:

"aggregations" : {
    "TEXT1" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "4001",
          "doc_count" : 14
        },
        {
          "key" : "9113",
          "doc_count" : 12
        },
        {
          "key" : "9777",
          "doc_count" : 6
        },
        {
          "key" : "1010",
          "doc_count" : 4
        },
        {
          "key" : "1608",
          "doc_count" : 4
        },
        {
          "key" : "0001",
          "doc_count" : 2
        },
        {
          "key" : "1000",
          "doc_count" : 2
        }
      ]
    }
  }
}

Best Regards,
Dan

There is an option for running a script to get the value. You'd use a script like doc['ci.rc.keyword'].value.charAt(0).

That might not work properly if the character doesn't fit in the Basic Multilingual Plane, but it probably does. If the script had to deal with that, it would be a bit more complex than I can compose in an email.
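For reference, here is a minimal sketch of where such a script could go in the terms aggregation from the question. It assumes that every matching document has a non-empty value in ci.rc.keyword; the aggregation name first_char is made up for illustration, and the leading # comment line is Console syntax that may need to be dropped in other clients.

```
# Sketch only: "first_char" is an illustrative name; assumes ci.rc.keyword is present on every document
GET /log-2020.07.07/_search?size=0
{
  "aggs": {
    "first_char": {
      "terms": {
        "script": {
          "lang": "painless",
          "source": "doc['ci.rc.keyword'].value.charAt(0)"
        },
        "size": 10
      }
    }
  }
}
```

The only change from the original query is that "field" is replaced by a "script" object, so each bucket key is the value the script returns rather than the stored field value.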

It doesn't work:

"script" : "doc['ci.rc'].value.charAt(0)",
"lang" : "painless",
"caused_by" : {
  "type" : "illegal_argument_exception",
  "reason" : "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [hi.rc] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
}

"script" : "doc['ci.rc.keyword'].value.charAt(0)",
"lang" : "painless",
"caused_by" : {
  "type" : "illegal_state_exception",
  "reason" : "A document doesn't have a value for a field! Use doc[<field>].size()==0 to check if a document is missing a field!"
}

I think you'll have to play with the script some. Checking the size first sounds right.
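A size check along those lines might look like the sketch below. This is one possible way to write it, not necessarily the script that was eventually used in this thread. The aggregation name first_char and the "(missing)" placeholder key are hypothetical, the script uses substring(0, 1) so that both branches return a string, and it still assumes that any value that is present is non-empty.

```
# Sketch only: "first_char" and "(missing)" are illustrative; assumes present values are non-empty
GET /log-2020.07.07/_search?size=0
{
  "aggs": {
    "first_char": {
      "terms": {
        "script": {
          "lang": "painless",
          "source": "if (doc['ci.rc.keyword'].size() == 0) { return '(missing)'; } return doc['ci.rc.keyword'].value.substring(0, 1);"
        },
        "size": 10
      }
    }
  }
}
```

If the buckets listed in the question cover all documents that have a value, and each document holds a single value, the first-character buckets would be expected to collapse to 9 (18 docs), 4 (14), 1 (10) and 0 (2), plus a "(missing)" bucket for the documents without the field.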

Thanks @nik9000. I'll check it.

@nik9000 the script works well in the aggregation. Many thanks.
