Field _size on data stream

Cristina_Marletta_Li · February 9, 2026, 5:15pm

Hi, I'd like to set the _size field on a data stream associated with an integration. How can I do this? The data stream has a custom index template associated with it:

GET _index_template/logs-thales_udp.log-custom

{
  "index_templates": [
    {
      "name": "logs-thales_udp.log-custom",
      "index_template": {
        "index_patterns": [
          "logs-thales_udp.log-*"
        ],
        "template": {
          "settings": {
            "index": {
              "lifecycle": {
                "name": "thales-ilmpolicy-custom"
              }
            }
          },
          "mappings": {
            "_meta": {
              "package": {
                "name": "udp"
              },
              "managed_by": "fleet",
              "managed": false
            }
          }
        },
        "composed_of": [
          "logs@mappings",
          "logs@settings",
          "logs-thales_udp.log@package",
          "logs@custom",
          "logs-thales_udp.log@custom",
          "ecs@mappings",
          ".fleet_globals-1",
          ".fleet_agent_id_verification-1"
        ],
        "priority": 501,
        "_meta": {
          "package": {
            "name": "udp"
          },
          "managed_by": "fleet",
          "managed": false
        },
        "data_stream": {
          "hidden": false,
          "allow_custom_routing": false
        },
        "ignore_missing_component_templates": [
          "logs@custom",
          "logs-thales_udp.log@custom"
        ]
      }
    }
  ]
}

I installed the mapper-size plugin.

If I try to set the _size field with:

PUT _index_template/logs-thales_udp.log-custom
{
  "mappings": {
    "_size": {
      "enabled": true
    }
  }
}

I get an error:

{
  "error": {
    "root_cause": [
      {
        "type": "x_content_parse_exception",
        "reason": "[2:3] [index_template] unknown field [mappings]"
      }
    ],
    "type": "x_content_parse_exception",
    "reason": "[2:3] [index_template] unknown field [mappings]"
  },
  "status": 400
}

Tortoise · February 10, 2026, 11:36am

Hello @Cristina_Marletta_Li

If the mappings are placed inside a template block, together with index_patterns, the request does not return an error:

PUT _index_template/logs-thales_udp.log-custom
{
  "index_patterns": ["logs-thales_udp.log-*"],
  "template": {
    "mappings": {
      "_size": {
        "enabled": true
      }
    }
  }
}

Thanks!!

Cristina_Marletta_Li · February 10, 2026, 3:02pm

Thank you Tortoise.

I've now successfully modified the index template (using the ...@custom component template). I installed the mapper-size plugin on the master, hot, and cold nodes. I rolled over, but I don't see the _size field on the new indices.
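
For readers following along, here is a minimal sketch of the kind of request this step involves. It assumes the elided "...@custom" template is logs-thales_udp.log@custom from the composed_of list above (it could equally be logs@custom), and it only builds and prints the request rather than sending it. Editing the @custom component template, rather than overwriting the index template itself, leaves the rest of the Fleet-generated index template untouched.

```python
import json

# Assumption: the component template edited in this step is the
# integration-specific one listed in composed_of earlier in the thread.
COMPONENT_TEMPLATE = "logs-thales_udp.log@custom"


def size_component_template_body() -> dict:
    """Build a component template body that enables the _size metadata field."""
    return {"template": {"mappings": {"_size": {"enabled": True}}}}


body = size_component_template_body()

# Print the request in console form; nothing is sent to a cluster here.
print(f"PUT _component_template/{COMPONENT_TEMPLATE}")
print(json.dumps(body, indent=2))
```

The mapping only applies to backing indices created after the change, which is why a rollover is needed before checking for the field.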

Tortoise · February 11, 2026, 8:19am

Hello @Cristina_Marletta_Li

Could you please check the mapping of the new index generated after the rollover?

GET <new-index/datastream-name>/_mapping

If not, check the mapping of the latest index:

GET logs-thales_udp.log/_mapping

Thanks!!

Cristina_Marletta_Li · February 11, 2026, 9:29am

Hello @Tortoise ,

In the mapping section, I can see:

"_size": {
  "enabled": true
}

I found the value of _size in an event just by looking at its JSON format. It's not in the Table representation of the event. Strange!

Tortoise · February 11, 2026, 10:29am

Hello @Cristina_Marletta_Li

If you want to view it in the Table, you need to add it as part of the below:

Related documentation:

Thanks!!

Cristina_Marletta_Li · February 11, 2026, 3:35pm

Thank you @Tortoise,

now there's only one thing missing.

If I query the index via the API, e.g.

GET logs-thales_udp.log-default/_search
{
  "query": {
    "match": {
      "_id": "xxxxxx"
    }
  }
}

I don't see the _size field in the response.

I do now see the _size field in the metadata, though.

stephenb · February 11, 2026, 11:09pm

Try

GET logs-thales_udp.log-default/_search
{
  "fields": ["*"],
  "query": {
    "match": {
      "_id": "xxxxxx"
    }
  }
}

Cristina_Marletta_Li · February 16, 2026, 1:37pm

All ok! I have the _size field!

Now I wonder: how should I interpret the _size field, given that I'm interested in the storage occupied by a set of events?

I thought _size was the number of bytes in the event's _source in its JSON form, but that does not seem to be the case.

Can you help me understand this field?

Thank you

stephenb · February 16, 2026, 3:30pm

From the docs:

The mapper-size plugin provides the _size metadata field which, when enabled, indexes the size in bytes of the original _source field.

That is not the total number of bytes used for the whole document once it is indexed (inverted index, doc values, synthetic source, etc.). The size on disk is not known before the document is written, so there is no way to store it in a field of that document, because documents are immutable!

So I'm curious now. What are you trying to figure out?

Are you looking for the average size per doc?
That is simply primary size on disk / document count.

Do you want to know what fields are taking up the most space?

Use the _disk_usage API

And of course, the size on disk is typically reduced after merging happens, so the average doc size usually shrinks after merging compared to when the documents were first written.

RainTown · February 16, 2026, 4:09pm

In fairness, that's not what the doc says. It says it's "the size in bytes of the original _source field".

@Cristina_Marletta_Li

A quick test showed that an (empty) doc with no fields/values has _size == 3, which is maybe {} plus a null? Add a keyword field {"x":""}, and _size jumps to 9. Add spaces around the : and _size is now 11. Set {"x":"1234567890"} and it's 19. So it seems to check out to me, as I'm not going to quibble about a byte here or there!?
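
The numbers in this test can be reproduced offline. The sketch below counts the UTF-8 bytes of each request body and compares them with the _size values reported above. The exact bodies are assumptions: the empty document is taken to be {} and the "spaces around the :" variant is taken to have one space on each side. The helper name source_size_bytes is made up for this illustration.

```python
def source_size_bytes(raw_source: str) -> int:
    """Number of bytes in the UTF-8 encoding of the raw JSON text."""
    return len(raw_source.encode("utf-8"))


# (assumed request body, _size value reported in the post above)
samples = [
    ('{}', 3),
    ('{"x":""}', 9),
    ('{"x" : ""}', 11),
    ('{"x":"1234567890"}', 19),
]

for raw, reported in samples:
    computed = source_size_bytes(raw)
    # The difference comes out as 1 for every sample.
    print(f"{raw!r}: computed={computed} reported={reported} diff={reported - computed}")
```

Every sample differs from the reported value by exactly one byte. One possible explanation, not verified here, is a trailing newline sent with the request body.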

stephenb · February 16, 2026, 5:23pm

Total agreement, not sure where we are misaligned

The main takeaway is that _size doesn't represent the total storage needed to index the document to disk. It only reflects the size of the _source field's JSON before indexing, pretty sure uncompressed, as you demonstrated.

The actual size on disk (for non-Synthetic Source) is typically the compressed size of the _source field plus the data structures necessary for indexing based on the mappings. Therefore, _size and the actual storage required on disk for a document are not the same.

I just wanted to clarify this. If we aren't focused on the actual size on disk, then perhaps we're all set.

With Synthetic Source, the _source field isn't stored on disk at all, which creates a much more significant difference.

The only reliable way to see the actual storage on disk is by using the _disk_usage API I mentioned earlier.

Historically, _size was used to understand "the size of what I'm indexing" and was often confused with the size on disk. However, it was a helpful indicator for identifying which documents were "expensive" or "heavy."

So, getting back to my earlier question: what are we trying to solve here?

RainTown · February 16, 2026, 5:45pm

AFAIK we're not, which is good !!

I was just trying to understand the very narrow point the OP had made:

Plus or minus the odd byte, for at least the small samples I used to validate, the _size field seems to me to match what the docs said it would be. @Cristina_Marletta_Li might wish to share her experience / evidence to the contrary?

@Cristina_Marletta_Li can answer that for herself. But sometimes people just want to understand something at quite a low level. Or indeed verify that "actual" matches "documented". It might even be "asking the wrong question", but ... no harm in asking, and right now I don't know the answer to the "what are we trying to solve" question either!

Cristina_Marletta_Li · February 17, 2026, 10:41am

Hi @stephenb ,

For design reasons, I'm interested in calculating the amount of storage occupied by a set of events in a set of indexes where a field (for example, project) is x. The indexes are not dedicated to project x, so I can't simply read the size of the indexes in Index Management.

stephenb · February 17, 2026, 7:23pm

Hi @Cristina_Marletta_Li

In short, there is no easy "Exact" way to calculate the size of individual documents, nor a subset of documents within an index.

There are perhaps some easier ways to get close enough:

1. Get the total primary store size of the index.
2. Calculate the ratio of Project X documents to total documents.
3. Use the proportional count and size as guidance.

If the sizes of the Project X and non-Project X documents are vastly different, this is not a great approach, but if they are not too different, then this can work. You could look at _size and the mapping to refine this approach.
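
As a rough sketch of the proportional approach described above: the function below attributes a share of the primary store size to a subset of documents. The weights can be plain document counts or, as a refinement, summed _size values (which could be obtained with a sum aggregation on _size filtered by project). The function name and all numbers are hypothetical and for illustration only.

```python
def estimate_subset_bytes(pri_store_bytes: float,
                          subset_weight: float,
                          total_weight: float) -> float:
    """Attribute a proportional share of the primary store size to a subset.

    The weights may be document counts, or summed _size values when the
    subset's documents differ in size from the rest of the index.
    """
    if total_weight <= 0:
        raise ValueError("total_weight must be positive")
    return pri_store_bytes * subset_weight / total_weight


GIB = 1024 ** 3

# Hypothetical figures: a 10 GiB index holding 8M docs, 2M of them Project X.
pri_store_bytes = 10 * GIB
by_count = estimate_subset_bytes(pri_store_bytes, 2_000_000, 8_000_000)

# Same index, weighted by summed _size instead (Project X docs are larger).
by_size = estimate_subset_bytes(pri_store_bytes, 3.0e9, 6.0e9)

print(f"weighted by doc count:   {by_count / GIB:.2f} GiB")
print(f"weighted by summed _size: {by_size / GIB:.2f} GiB")
```

With these made-up inputs, the count-based estimate is 2.50 GiB and the _size-weighted estimate is 5.00 GiB, which shows how far the two can diverge when document sizes differ. Both remain estimates, since _size reflects the uncompressed _source rather than on-disk structures.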

Or, if you need something exact, you will need to do more work. For example:

1. Reindex a representative set of Project X documents into a dedicated index with the correct mappings, settings, etc.
2. Force merge to 1 segment.
3. Then measure primary store size / total docs.
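
The arithmetic for this sample-based approach can be sketched as follows: bytes per document is the primary store size of the dedicated index divided by its document count, and that average is then scaled up to the full Project X document count. The function names and figures are hypothetical.

```python
def avg_bytes_per_doc(pri_store_bytes: float, doc_count: int) -> float:
    """Average on-disk bytes per document in the force-merged sample index."""
    if doc_count <= 0:
        raise ValueError("doc_count must be positive")
    return pri_store_bytes / doc_count


def extrapolate_bytes(avg_bytes: float, target_doc_count: int) -> float:
    """Scale the sample average up to the full number of matching documents."""
    return avg_bytes * target_doc_count


MIB = 1024 ** 2
GIB = 1024 ** 3

# Hypothetical sample: 1M Project X docs occupying 512 MiB after force merge.
sample_avg = avg_bytes_per_doc(512 * MIB, 1_000_000)

# Hypothetical total: 25M Project X docs across all indices.
total_estimate = extrapolate_bytes(sample_avg, 25_000_000)

print(f"average per doc: {sample_avg:.1f} bytes")
print(f"estimated total: {total_estimate / GIB:.2f} GiB")
```

The result is only as good as the sample is representative, and it assumes the production indices compress and merge similarly to the dedicated one.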

How you approach this will depend on the accuracy you require.

Cristina_Marletta_Li · February 18, 2026, 10:18am

Thank you @stephenb
