How to map memory size

I have fields that are currently being ingested as

resources_used.vmem: 1028974kb
resources_requested.vmem: 2000000kb

They are mapped as keyword, but I would like to have the "kb" (or whatever file-size denomination) stripped and to store them as a numeric type so I can query on them. What would be the best way to do this?

Use an ingest pipeline with this processor: Bytes processor | Elasticsearch Guide [8.14] | Elastic
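A minimal pipeline definition might look like this (the pipeline name is illustrative, and this assumes resources_used / resources_requested are real object fields in the source document; the bytes processor replaces the string with its value in bytes):

```
PUT _ingest/pipeline/convert-vmem
{
  "processors": [
    { "bytes": { "field": "resources_used.vmem" } },
    { "bytes": { "field": "resources_requested.vmem" } }
  ]
}
```

You can then attach it to the index with the index.default_pipeline setting, or pass it per request with ?pipeline=convert-vmem.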

Reviving this topic - I may be misunderstanding how the Bytes processor works. The docs say:

Converts a human readable byte value (e.g. 1kb) to its value in bytes (e.g. 1024). If the field is an array of strings, all members of the array will be converted.
Supported human readable units are "b", "kb", "mb", "gb", "tb", "pb" case insensitive. An error will occur if the field is not a supported format or resultant value exceeds 2^63.

When running the bytes processor on my field, it seems to have no effect. This is the input on a test document:

...,
"resources_used.mem": "3528kb",
...

This is the processor:


This is the output

"resources_used.mem": "3528kb",

The input type is a string. I can't find any other info on the processor, what the input type needs to be, or what it outputs as.

For reference this is Elastic 8.16.1

Go to Dev Tools to debug:

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "bytes": {
          "field": "resources_used.mem",
          "ignore_failure": true
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "resources_used.mem": "3528kb"
      }
    },
    {
      "_source": {
        "resources_used": {
          "mem": "3528kb"
        }
      }
    }
  ]
}

Note that the first fails and the second works... What does the actual source document look like? Ingest pipeline processors do not work on dotted field names.

That makes sense. I do some scripting in the ingest pipeline that makes resources_used.mem end up that way. The field in the source is "message". I parse through that field using the script below, splitting the string on " " and then on "=", and appending the values to the context (in this case the source would be "resources_used.mem=3528kb").

I presume the formatting issue stems from this... but my Painless skill is quite weak:

if (ctx['message'] != null && !ctx['message'].isEmpty()) {
  // split the raw message into space-separated key=value tokens
  String[] messSplit = ctx['message'].splitOnToken(' ');
  for (item in messSplit) {
    if (item.contains("Resource_List.select")) {
      // the select value itself contains '=', so split on the first '=' only
      String[] splitItem = /=/.split(item, 2);
      String label = splitItem[0];
      String data = splitItem[1];
      ctx[label] = data;
    } else {
      String[] splitItem = item.splitOnToken("=");
      if (splitItem.length <= 1) {
        // token has no '=', skip it
        continue;
      }
      String label = splitItem[0];
      String data = splitItem[1];
      ctx[label] = data;
    }
  }
}

ctx[label] = data is what writes "resources_used.mem": "3528kb" to the top-level field of the same name, mapped as text.
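For what it's worth, one way to avoid the dotted top-level keys entirely (and so the need for dot_expander) would be to build the nested structure inside the script itself. A rough sketch, untested, assuming two-level labels like resources_used.mem:

```
// replace ctx[label] = data with something like:
if (label.contains(".")) {
  // splitOnToken takes a literal token, so '.' is safe here
  String[] parts = label.splitOnToken(".");
  if (ctx[parts[0]] == null) {
    ctx[parts[0]] = new HashMap();
  }
  // writes ctx['resources_used']['mem'] = '3528kb' as a real nested field
  ctx[parts[0]][parts[1]] = data;
} else {
  ctx[label] = data;
}
```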

Now it works for both... :slight_smile:

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "dot_expander": {
          "field": "resources_used.mem"
        }
      },
      {
        "bytes": {
          "field": "resources_used.mem",
          "ignore_failure": true
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "resources_used.mem": "3528kb"
      }
    },
    {
      "_source": {
        "resources_used": {
          "mem": "3528kb"
        }
      }
    }
  ]
}
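One follow-up worth noting: the bytes processor replaces the string with a long in bytes (e.g. "3528kb" becomes 3612672), so the target field should be mapped as a numeric type rather than keyword. A sketch, with an illustrative index name:

```
PUT my-index
{
  "mappings": {
    "properties": {
      "resources_used": {
        "properties": {
          "mem": { "type": "long" }
        }
      }
    }
  }
}
```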

That worked. Thanks! Nice to have a processor made for this exact purpose :grin:.