Add new field to index based on maths calculation from other fields in the same index

Hi,

I use elastic-agent on EKS with kubernetes integration.

One of the fields in the index, kubernetes.volume.fs.used.pct, reports incorrect values.

I was able to get the correct value using the following formula:

subtract(100, multiply(average(kubernetes.volume.fs.available.bytes) / max(kubernetes.volume.fs.capacity.bytes), 100))
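As a sanity check, the same arithmetic can be reproduced outside Elasticsearch. This is only a minimal Python sketch for a single document, without the average/max aggregations of the formula above; the function name used_pct is made up for illustration, and the sample byte counts are the ones used in the simulate request later in this thread.

```python
def used_pct(available_bytes: int, capacity_bytes: int) -> float:
    """Used percentage = 100 - (available / capacity) * 100."""
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    return 100.0 - (available_bytes / capacity_bytes) * 100.0


print(used_pct(9786859520, 21462233088))  # roughly 54.4
```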

Based on that new metric, I would like to create an alert when the used pod volume is above 80%.
I have added the following processor configuration to the ingest pipeline

metrics-kubernetes.volume@custom

which should add a new field to the index:

ctx['kubernetes.volume.fs.used.custom.pct'] = 100 - (ctx['kubernetes.volume.fs.available.bytes'] / ctx['kubernetes.volume.fs.capacity.bytes']) * 100

It doesn't work: the new field was empty, and on checking again it doesn't exist at all.

The question is: how do I add a new field to an index based on a maths calculation from other fields in the same index, as required for alerting?

I assume that would have to be done later than the ingest pipeline.

Hi @patcan Welcome to the community!

You need to share your actual ingest pipeline. You can get it from Kibana -> Dev Tools with

GET _ingest/pipeline/metrics-kubernetes.volume@custom

That is the wrong syntax for accessing fields in an ingest pipeline...
Please read the docs; the syntax for conditions differs from the syntax for accessing field data (which is what you are doing):

Several processor parameters support Mustache template snippets. To access field values in a template snippet, enclose the field name in triple curly brackets: {{{field-name}}}. You can use template snippets to dynamically set field names.

It should look more like

"set": {
        "description": "my custom metric",
        "field": "kubernetes.volume.fs.used.custom.pct",
        "value": "100 - ({{{kubernetes.volume.fs.available.bytes}}} / {{{kubernetes.volume.fs.capacity.bytes}}}) * 100"
      }

And really a percent is a decimal, so it should be more like

"value": "(1.0 - {{{kubernetes.volume.fs.available.bytes}}} / {{{kubernetes.volume.fs.capacity.bytes}}})"

Hi @stephenb

{
  "metrics-kubernetes.volume@custom": {
    "processors": [
      {
        "script": {
          "ignore_failure": true,
          "source": "ctx['kubernetes.volume.fs.used.custom.pct'] = 100 - (ctx['kubernetes.volume.fs.available.bytes'] / ctx['kubernetes.volume.fs.capacity.bytes']) * 100"
        }
      }
    ]
  }
}

I will try your solution. Thanks

Nope. I now have the suggested set processor, and this is the result:

{
  "metrics-kubernetes.volume@custom": {
    "processors": [
      {
        "set": {
          "field": "kubernetes.volume.fs.used.custom.pct",
          "value": "(1.0 - {{{kubernetes.volume.fs.available.bytes}}} / {{{kubernetes.volume.fs.capacity.bytes}}})",
          "ignore_failure": true
        }
      }
    ]
  }
}

The kubernetes.volume.fs.used.custom.pct field ends up with this value:
(1.0 - 9786859520 / 21462233088)

It looks like it reads the values of the fields but doesn't perform the maths
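That matches how template snippets behave: the field values are substituted into the string, and nothing evaluates the result. The Python sketch below is a rough imitation of that substitution step, not the actual Mustache engine; render_template is a hypothetical helper written only for this illustration.

```python
import re


def render_template(template: str, doc: dict) -> str:
    """Replace each {{{field}}} with the string form of doc[field]."""
    return re.sub(r"\{\{\{(.+?)\}\}\}", lambda m: str(doc[m.group(1)]), template)


doc = {
    "kubernetes.volume.fs.available.bytes": 9786859520,
    "kubernetes.volume.fs.capacity.bytes": 21462233088,
}
template = (
    "(1.0 - {{{kubernetes.volume.fs.available.bytes}}}"
    " / {{{kubernetes.volume.fs.capacity.bytes}}})"
)

rendered = render_template(template, doc)
print(rendered)        # (1.0 - 9786859520 / 21462233088)
print(type(rendered))  # still a string, never a number
```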

Try taking out the double quotes. Edit: that was wrong!

@patcan

Apologies, I led you astray. You need to use a script processor, which makes sense after I thought about it... so you were close to start with.

Also, do not put raw numbers into the equation; use params. It is much more efficient: in short, the script only needs to be compiled once.

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "script": {
          "lang": "painless",
          "source": "ctx['kubernetes.volume.fs.used.custom.pct'] = (params.one - (double)(ctx['kubernetes.volume.fs.available.bytes']) / (double)(ctx['kubernetes.volume.fs.capacity.bytes'])) * params.to_percent",
          "params": {
            "to_percent": 100.0,
            "one": 1.0
          }
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "kubernetes.volume.fs.available.bytes": 9786859520,
        "kubernetes.volume.fs.capacity.bytes": 21462233088
      }
    }
  ]
}


# Results
{
  "docs": [
    {
      "doc": {
        "_index": "_index",
        "_id": "_id",
        "_version": "-3",
        "_source": {
          "kubernetes.volume.fs.capacity.bytes": 21462233088,
          "kubernetes.volume.fs.used.custom.pct": 54.39962151248816,
          "kubernetes.volume.fs.available.bytes": 9786859520
        },
        "_ingest": {
          "timestamp": "2023-07-06T20:03:34.48287543Z"
        }
      }
    }
  ]
}
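A note on the (double) casts in the script: Painless follows Java-style arithmetic, so dividing two integer-typed values truncates the result, and the ratio of available to capacity would come out as 0. The Python lines below are only an analogy, with // standing in for integer division (for positive values, flooring and truncating give the same result).

```python
available = 9786859520
capacity = 21462233088

truncated = available // capacity  # integer division, like long / long
exact = available / capacity       # floating-point division, like (double) a / (double) b

print((1 - truncated) * 100)       # 100
print((1.0 - exact) * 100.0)       # roughly 54.4
```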

I'm facing a strange thing here.

Copying and pasting the above POST works fine, but adding the processor to the ingest pipeline doesn't.
I get this error:

{
  "docs": [
    {
      "processor_results": [
        {
          "processor_type": "script",
          "status": "error_ignored",
          "ignored_error": {
            "error": {
              "root_cause": [
                {
                  "type": "script_exception",
                  "reason": "runtime error",
                  "script_stack": [
                    "ctx['kubernetes.volume.fs.used.custom.pct'] = (params.one - (double)(ctx['kubernetes.volume.fs.available.bytes']) / (double)(ctx['kubernetes.volume.fs.capacity.bytes'])) * params.to_percent",
                    "                                                                        ^---- HERE"
                  ],
                  "script": "ctx['kubernetes.volume.fs.used.custom.pct'] = (params.one - (double)(ctx['kubernetes.volume.fs.available.bytes']) / (double)(ctx['kubernetes.volume.fs.capacity.bytes'])) * params.to_percent",
                  "lang": "painless",
                  "position": {
                    "offset": 72,
                    "start": 0,
                    "end": 189
                  }
                }
              ],
              "type": "script_exception",
              "reason": "runtime error",
              "script_stack": [
                "ctx['kubernetes.volume.fs.used.custom.pct'] = (params.one - (double)(ctx['kubernetes.volume.fs.available.bytes']) / (double)(ctx['kubernetes.volume.fs.capacity.bytes'])) * params.to_percent",
                "                                                                        ^---- HERE"
              ],
              "script": "ctx['kubernetes.volume.fs.used.custom.pct'] = (params.one - (double)(ctx['kubernetes.volume.fs.available.bytes']) / (double)(ctx['kubernetes.volume.fs.capacity.bytes'])) * params.to_percent",
              "lang": "painless",
              "position": {
                "offset": 72,
                "start": 0,
                "end": 189
              },
              "caused_by": {
                "type": "null_pointer_exception",
                "reason": "Cannot invoke \"Object.getClass()\" because \"value\" is null"
              }
            }
          },
          "doc": {
            "_index": ".test",
            "_version": "-3",
            "_id": "test",
            "_source": {
              "kubernetes": {
                "volume": {
                  "name": "test",
                  "fs": {
                    "available": {
                      "bytes": 262131712
                    },
                    "used": {
                      "pct": 0.000046875,
                      "bytes": 12288
                    },
                    "inodes": {
                      "pct": 0.0000044726719742374095,
                      "count": 2012220,
                      "used": 9,
                      "free": 2012211
                    },
                    "capacity": {
                      "bytes": 262144000
                    }
                  }
                }
              }
            },
            "_ingest": {
              "pipeline": "_simulate_pipeline",
              "timestamp": "2023-07-07T12:06:45.685737624Z"
            }
          }
        }
      ]
    }
  ]
}

Even adding the processor to the ingest pipeline via JSON import doesn't help:

  {
    "processors": [
      {
        "script": {
          "lang": "painless",
          "source": "ctx['kubernetes.volume.fs.used.custom.pct'] = (params.one - (double)(ctx['kubernetes.volume.fs.available.bytes']) / (double)(ctx['kubernetes.volume.fs.capacity.bytes'])) * params.to_percent",
          "params": {
            "to_percent": 100.0,
            "one": 1.0
          }
        }
      }
    ]
  }

Document used to test the pipeline:

[
  {
    "_index": ".test",
    "_id": "test",
    "_source": {
      "kubernetes": {
        "volume": {
          "name": "test",
          "fs": {
            "inodes": {
              "pct": 0.0000044726719742374095,
              "count": 2012220,
              "used": 9,
              "free": 2012211
            },
            "available": {
              "bytes": 262131712
            },
            "used": {
              "pct": 0.000046875,
              "bytes": 12288
            },
            "capacity": {
              "bytes": 262144000
            }
          }
        }
      }
    }
  }  
]

BTW if I check the ingest pipeline - all looks good IMO

GET _ingest/pipeline/metrics-kubernetes.volume@custom

result:

{
  "metrics-kubernetes.volume@custom": {
    "processors": [
      {
        "script": {
          "lang": "painless",
          "source": "ctx['kubernetes.volume.fs.used.custom.pct'] = (params.one - (double)(ctx['kubernetes.volume.fs.available.bytes']) / (double)(ctx['kubernetes.volume.fs.capacity.bytes'])) * params.to_percent",
          "params": {
            "to_percent": 100,
            "one": 1
          },
          "ignore_failure": true
        }
      }
    ]
  }
}

Hmmm yes... I see that too. Head scratching; I will take a look when I get a chance.
Probably some form of syntax error.

@patcan Got it!!! Wrong syntax.
Remember, I would use decimal params: 1.0, 100.0

Not sure why the other syntax does not work, but the shorthand seems to. See here:

Access source fields

The script processor parses each incoming document’s JSON source fields into a set of maps, lists, and primitives. To access these fields with a Painless script, use the map access operator: ctx['my-field']. You can also use the shorthand ctx.<my-field> syntax.

PUT _ingest/pipeline/metrics-kubernetes.volume@custom
{
  "processors": [
    {
      "script": {
        "lang": "painless",
        "source": "ctx.temp_calc_result = (params.one - (double)(ctx.kubernetes.volume.fs.available.bytes) / (double)(ctx.kubernetes.volume.fs.capacity.bytes)) * params.to_percent",
        "params": {
          "to_percent": 100.0,
          "one": 1.0
        },
        "ignore_failure": true
      }
    },
    {
      "set": {
        "field": "kubernetes.volume.fs.used.custom.pct",
        "copy_from": "temp_calc_result"
      }
    },
    {
      "remove": {
        "field": "temp_calc_result"
      }
    }
  ]
}


# Simulate

POST _ingest/pipeline/metrics-kubernetes.volume@custom/_simulate
{
  "docs": [
    {
      "_index": ".test",
      "_id": "test",
      "_source": {
        "kubernetes": {
          "volume": {
            "name": "test",
            "fs": {
              "available": {
                "bytes": 262131712
              },
              "capacity": {
                "bytes": 262144000
              }
            }
          }
        }
      }
    }
  ]
}

# Results

{
  "docs": [
    {
      "doc": {
        "_index": ".test",
        "_id": "test",
        "_version": "-3",
        "_source": {
          "kubernetes": {
            "volume": {
              "name": "test",
              "fs": {
                "available": {
                  "bytes": 262131712
                },
                "used": {
                  "custom": {
                    "pct": 0.004687500000000178
                  }
                },
                "capacity": {
                  "bytes": 262144000
                }
              }
            }
          }
        },
        "_ingest": {
          "timestamp": "2023-07-07T16:56:24.869550011Z"
        }
      }
    }
  ]
}

# Now test with a real document

POST test-discuss/_doc/?pipeline=metrics-kubernetes.volume@custom
{
  "kubernetes": {
    "volume": {
      "name": "test",
      "fs": {
        "available": {
          "bytes": 262131712
        },
        "capacity": {
          "bytes": 262144000
        }
      }
    }
  }
}

# Results YAY!
GET test-discuss/_search
{
  "fields": [
    "*"
  ]
}

# Results
{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "test-discuss",
        "_id": "mYpJMYkBuXkTucfoSgvb",
        "_score": 1,
        "_source": {
          "kubernetes": {
            "volume": {
              "name": "test",
              "fs": {
                "available": {
                  "bytes": 262131712
                },
                "used": {
                  "custom": {
                    "pct": 0.004687500000000178
                  }
                },
                "capacity": {
                  "bytes": 262144000
                }
              }
            }
          }
        },
        "fields": {
          "kubernetes.volume.fs.capacity.bytes": [
            262144000
          ],
          "kubernetes.volume.name": [
            "test"
          ],
          "kubernetes.volume.fs.available.bytes": [
            262131712
          ],
          "kubernetes.volume.name.keyword": [
            "test"
          ],
          "kubernetes.volume.fs.used.custom.pct": [
            0.0046875
          ]
        }
      }
    ]
  }
}

Ahhhh, I found it. The proper form of the other syntax is nested map access:

ctx['kubernetes']['volume']['fs']['available']['bytes']
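This also seems to explain the earlier null_pointer_exception. The first simulate request used flat documents whose top-level keys literally contained dots, while real documents are nested objects, so a single dotted key finds nothing. A small Python sketch of the difference, using a plain dict as a stand-in for ctx (an analogy, not Painless):

```python
doc = {"kubernetes": {"volume": {"fs": {
    "available": {"bytes": 262131712},
    "capacity": {"bytes": 262144000},
}}}}

# A single dotted key is looked up literally at the top level and is missing.
flat = doc.get("kubernetes.volume.fs.available.bytes")

# Walking the nested maps one level at a time finds the value.
nested = doc["kubernetes"]["volume"]["fs"]["available"]["bytes"]

print(flat)    # None
print(nested)  # 262131712
```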

Yep @stephenb, that's the solution. Thank you.

I had to add another set processor in front to create the new field.
The script processor fails if the field doesn't exist.
The full working JSON looks like this:

[
  {
    "set": {
      "field": "kubernetes.volume.fs.used.custom.pct",
      "value": "0",
      "override": false
    }
  },
  {
    "script": {
      "source": "ctx['kubernetes']['volume']['fs']['used']['custom']['pct'] = (params.one - (double)(ctx['kubernetes']['volume']['fs']['available']['bytes']) / (double)(ctx['kubernetes']['volume']['fs']['capacity']['bytes'])) * params.to_percent",
      "params": {
        "to_percent": 100,
        "one": 1
      },
      "ignore_failure": true
    }
  }
]
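The reason the leading set processor helps can be pictured with nested dicts: assigning to ['used']['custom']['pct'] needs the intermediate 'custom' map to exist first. The sketch below is a Python analogy under that assumption; assign_pct is a hypothetical helper, and Python raises KeyError where Painless reports a null pointer.

```python
def assign_pct(doc: dict, value: float) -> None:
    """Mimics ctx['kubernetes']['volume']['fs']['used']['custom']['pct'] = value."""
    doc["kubernetes"]["volume"]["fs"]["used"]["custom"]["pct"] = value


doc = {"kubernetes": {"volume": {"fs": {"used": {"pct": 0.000046875}}}}}

try:
    assign_pct(doc, 1.0)
    failed = False
except KeyError:
    failed = True  # the "custom" map does not exist yet

# Rough stand-in for the preceding set processor: create the parent map first.
doc["kubernetes"]["volume"]["fs"]["used"].setdefault("custom", {})
assign_pct(doc, 1.0)

print(failed)                                                     # True
print(doc["kubernetes"]["volume"]["fs"]["used"]["custom"]["pct"])  # 1.0
```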

I just wonder how to round the result to 1 decimal place, so, let's say, 10.5%.

Should I use the convert processor to integer, or is there a better solution?

Got it. The solution is:

[
  {
    "set": {
      "field": "kubernetes.volume.fs.used.custom.pct",
      "value": "0",
      "override": false
    }
  },
  {
    "script": {
      "source": "ctx['kubernetes']['volume']['fs']['used']['custom']['pct'] = Math.round((params.one - (double)(ctx['kubernetes']['volume']['fs']['available']['bytes']) / (double)(ctx['kubernetes']['volume']['fs']['capacity']['bytes'])) * params.to_percent * 10.0) / 10.0",
      "params": {
        "to_percent": 100,
        "one": 1
      },
      "ignore_failure": true
    }
  }
]

Math.round

In the updated script, the result is multiplied by 10.0, then rounded using Math.round(), and finally divided by 10.0 again to get the rounded value with one decimal place.

Note that the Math.round() function rounds the number to the nearest whole number, so multiplying and dividing by 10.0 allows you to round to one decimal place.
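The multiply, round, divide trick can be checked quickly in Python. This is a sketch only: round_half_up approximates Java's Math.round as floor(value + 0.5), because Python's built-in round() rounds halves to the nearest even number and would not be a faithful stand-in. Both function names are made up for this example.

```python
import math


def round_half_up(value: float) -> int:
    """Approximates Java's Math.round: floor(value + 0.5)."""
    return math.floor(value + 0.5)


def round_one_decimal(value: float) -> float:
    """Scale by 10, round to a whole number, scale back down."""
    return round_half_up(value * 10.0) / 10.0


print(round_one_decimal(54.39962151248816))  # 54.4
```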

Thanks again
