Hi @leandrojmp
Thanks for your reply.
I'm not manually sending documents to Elasticsearch, I'm using Serilog with the Elasticsearch sink in a .NET application, configured like this:
Log.Logger = new LoggerConfiguration()
    .MinimumLevel.Information()
    .WriteTo.Elasticsearch(new ElasticsearchSinkOptions(new Uri("http://localhost:9200"))
    {
        DataStream = new DataStreamName("logs", "app", "myservice"),
        IlmPolicy = "logs-app-ilm-policy",
        MinimumLevel = LogEventLevel.Information
    })
    .Enrich.FromLogContext()
    .CreateLogger();
Here’s an example log statement I use in code:
using (LogContext.PushProperty("userId", userId))
{
    Log.Information("User with role {userRole} performed an action", userRole);
}
This ends up being indexed in Elasticsearch like this (simplified for privacy, with some values replaced):
{
  "_index": ".ds-logs-app-myservice-2025.05.15-000001",
  "_id": "random-id",
  "_version": 1,
  "_score": 0,
  "_source": {
     "agent": {
      "type": "Elastic.CommonSchema.Serilog",
      "version": "8.11.0"
    },
    "process": {
      "pid": 4567,
      "name": "MyApp",
      "title": "",
      "thread.id": 5
    },
    "metadata": {
      "userId": 1234
    },
   "log": {
      "logger": "Elastic.CommonSchema.Serilog"
    },
    "message": "User with role consumer performed an action", 
    "labels": {
      "MessageTemplate": "User with role {userRole} performed an action",
      "userRole": "consumer"
    },
     "@timestamp": "2025-05-15T13:33:51.573Z",
    ecs.version:"8.11.0"
    "service": {
      "name": "Elastic.CommonSchema.Serilog",
      "type": "dotnet",
      "version": "8.11.0"
    },
    "host": {
      "hostname": "USER-PC",
      "ip": ["0.0.0.0"],
      "os": {
        "platform": "Windows",
        "version": "0.0.0"
      },
    "log.level": "Information",
    "event": {
      "created": "2025-05-15T13:33:51.573Z",
      "severity": 2,
    },
    "user": {
      "id": "1234",
      "name": "user",
      "domain": "MYDOMAIN"
  },
  "fields": {
    "labels.userRole": ["consumer"],
    "metadata.userId": [1234],
    "log.level": ["Information"],
    "user.id": ["1234"],
    "user.name": ["user"],
    "host.hostname": ["USER-PC"],
    "host.os.platform": ["Windows"],
    "host.os.version": ["0.0.0"],
    "process.pid": [4567],
    "process.name": ["MyApp"],
    "process.title": [""],
    "process.thread.id": [5],
    "event.severity": [2],
    "event.created": ["2025-05-15T13:33:51.573Z"],
    "agent.type": ["Elastic.CommonSchema.Serilog"],
    "agent.version": ["8.11.0"],
    "service.name": ["Elastic.CommonSchema.Serilog"],
    "service.type": ["dotnet"],
    "service.version": ["8.11.0"],
    "@timestamp": ["2025-05-15T13:33:51.573Z"]
  }
}
So a single structured log line results in a fairly large document, with many fields auto-populated by the Serilog ECS formatter (e.g., agent, process, event, host). The custom fields from the structured log line (e.g., userRole) are also indexed.
My questions:
- I want to keep using structured logging (e.g. {userRole}), so fields like labels.userRole are extracted and stored separately. But if I don't actually need to search by userRole, is there a cost to indexing it anyway? Can/should I prevent indexing while still keeping the structure in _source? And what about all the auto-populated fields?
 
- I'm concerned about mapping explosion, since I have many types of logs with different properties. Is there a way to limit indexing to certain fields only? And at roughly how many fields should I start worrying about indexing structured log properties?
 
- Are there best practices or guidelines when using logging frameworks (like Serilog) with ECS and data streams to:
  - Avoid unnecessary mappings?
  - Reduce storage usage?
  - Still allow meaningful search and dashboards?
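For the userRole question, this is the kind of mapping override I was considering, to keep labels in _source without indexing them. It's an untested sketch with a placeholder template name, which I'd add to the data stream's index template via composed_of:

```json
PUT _component_template/logs-app-labels-noindex
{
  "template": {
    "mappings": {
      "properties": {
        "labels": {
          "type": "object",
          "enabled": false
        }
      }
    }
  }
}
```

My understanding is that `enabled: false` keeps the object in `_source` but skips both the inverted index and doc_values, so these fields would no longer be searchable or aggregatable — please correct me if that's wrong.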
 
I want to keep logs structured and usable in Kibana, but avoid bloating the index or hitting limits on mapping fields.
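For the mapping-explosion question, this is what I found so far and was planning to try (again untested, placeholder template name): capping `index.mapping.total_fields.limit` below the default of 1000 and turning off dynamic mapping, so unexpected properties stay in `_source` without creating new mapped fields:

```json
PUT _component_template/logs-app-mapping-limits
{
  "template": {
    "settings": {
      "index.mapping.total_fields.limit": 500
    },
    "mappings": {
      "dynamic": false
    }
  }
}
```

Is that a reasonable approach, or is there a more idiomatic way to do this with data streams?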
Thanks again for your help!