Hi @leandrojmp
Thanks for your reply.
I'm not manually sending documents to Elasticsearch; I'm using Serilog with the Elasticsearch sink in a .NET application, configured like this:
Log.Logger = new LoggerConfiguration()
    .MinimumLevel.Information()
    .WriteTo.Elasticsearch(new ElasticsearchSinkOptions(new Uri("http://localhost:9200"))
    {
        DataStream = new DataStreamName("logs", "app", "myservice"),
        IlmPolicy = "logs-app-ilm-policy",
        MinimumLevel = LogEventLevel.Information
    })
    .Enrich.FromLogContext()
    .CreateLogger();
Here’s an example log statement I use in code:
using (LogContext.PushProperty("userId", userId))
{
    Log.Information("User with role {userRole} performed an action", userRole);
}
This ends up being indexed in Elasticsearch like this (simplified for privacy, with some values replaced):
{
  "_index": ".ds-logs-app-myservice-2025.05.15-000001",
  "_id": "random-id",
  "_version": 1,
  "_score": 0,
  "_source": {
    "agent": {
      "type": "Elastic.CommonSchema.Serilog",
      "version": "8.11.0"
    },
    "process": {
      "pid": 4567,
      "name": "MyApp",
      "title": "",
      "thread.id": 5
    },
    "metadata": {
      "userId": 1234
    },
    "log": {
      "logger": "Elastic.CommonSchema.Serilog"
    },
    "message": "User with role consumer performed an action",
    "labels": {
      "MessageTemplate": "User with role {userRole} performed an action",
      "userRole": "consumer"
    },
    "@timestamp": "2025-05-15T13:33:51.573Z",
    "ecs.version": "8.11.0",
    "service": {
      "name": "Elastic.CommonSchema.Serilog",
      "type": "dotnet",
      "version": "8.11.0"
    },
    "host": {
      "hostname": "USER-PC",
      "ip": ["0.0.0.0"],
      "os": {
        "platform": "Windows",
        "version": "0.0.0"
      }
    },
    "log.level": "Information",
    "event": {
      "created": "2025-05-15T13:33:51.573Z",
      "severity": 2
    },
    "user": {
      "id": "1234",
      "name": "user",
      "domain": "MYDOMAIN"
    }
  },
  "fields": {
    "labels.userRole": ["consumer"],
    "metadata.userId": [1234],
    "log.level": ["Information"],
    "user.id": ["1234"],
    "user.name": ["user"],
    "host.hostname": ["USER-PC"],
    "host.os.platform": ["Windows"],
    "host.os.version": ["0.0.0"],
    "process.pid": [4567],
    "process.name": ["MyApp"],
    "process.title": [""],
    "process.thread.id": [5],
    "event.severity": [2],
    "event.created": ["2025-05-15T13:33:51.573Z"],
    "agent.type": ["Elastic.CommonSchema.Serilog"],
    "agent.version": ["8.11.0"],
    "service.name": ["Elastic.CommonSchema.Serilog"],
    "service.type": ["dotnet"],
    "service.version": ["8.11.0"],
    "@timestamp": ["2025-05-15T13:33:51.573Z"]
  }
}
So a single structured log line results in a fairly large document, with many fields auto-populated by the Serilog ECS formatter (e.g. agent, process, event, host). The custom fields from the structured log line (e.g. userRole) are also indexed.
My questions:
- I want to keep using structured logging (e.g. {userRole}), so that fields like labels.userRole are extracted and stored separately. But if I never actually need to search by userRole, is there a cost to indexing it anyway? Can/should I prevent indexing while still keeping the structure in the document? And what about all the auto-populated fields?
- I'm concerned about mapping explosion, since I have many types of logs with different properties. Is there a way to limit indexing to certain fields only (see the sketch after this list)? And at roughly how many fields should I start to worry about indexing structured log properties?
- Are there best practices or guidelines for using logging frameworks (like Serilog) with ECS and data streams to:
- Avoid unnecessary mappings
- Reduce storage usage
- Still allow meaningful search and dashboards?
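To make the first two questions concrete, here is roughly what I had in mind (an untested sketch; the template name logs-app-mappings and the field choices are placeholders based on my document above). My understanding is that "index": false would keep labels.userRole in _source without making it searchable, and "dynamic": false would stop new metadata.* properties from being mapped at all:

PUT _component_template/logs-app-mappings
{
  "template": {
    "mappings": {
      "properties": {
        "labels": {
          "properties": {
            "userRole": { "type": "keyword", "index": false, "doc_values": false }
          }
        },
        "metadata": {
          "type": "object",
          "dynamic": false
        }
      }
    }
  }
}

I assume I would also need to reference this component template from the index template that backs the logs-app-myservice data stream, which is part of why I'm asking whether this is the right direction at all.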
I want to keep logs structured and usable in Kibana, but avoid bloating the index or hitting limits on mapping fields.
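On the mapping-explosion point, so far I only know how to keep an eye on the field count with the standard APIs (my understanding is that the default index.mapping.total_fields.limit is 1000 fields per index):

GET logs-app-myservice/_mapping
GET logs-app-myservice/_field_caps?fields=*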
Thanks again for your help!