After upgrading the Elastic Stack, Fleet Server, and Elastic Agents to 9.3.4, and after re-enabling / upgrading integrations, we started seeing a very large number of errors in logs-elastic_agent-*.
The errors look like this:
message: failed to index document
error.type: version_conflict_engine_exception
http.response.status_code: 409
hint: check the "Known issues" section of Elasticsearch Exporter docs
Example target data streams / backing indices:
.ds-metrics-elasticsearch.stack_monitoring.shard-default-...
.ds-metrics-sql-default-...
Initially, we mostly observed this for Elasticsearch Stack Monitoring, especially the dataset:
metrics-elasticsearch.stack_monitoring.shard
Later, similar errors also appeared for SQL metrics.
Sample log
{
"message": "failed to index document",
"error.type": "version_conflict_engine_exception",
"http.response.status_code": 409,
"hint": "check the \"Known issues\" section of Elasticsearch Exporter docs",
"index": ".ds-metrics-elasticsearch.stack_monitoring.shard-default-2026.04.01-000023",
"otelcol.component.id": "elasticsearch/_agent-component/default",
"otelcol.component.kind": "exporter",
"otelcol.signal": "logs",
"resource.service.name": "/opt/Elastic/Agent/data/elastic-agent-9.3.4-.../components/elastic-otel-collector",
"resource.service.version": "9.3.4"
}
For SQL metrics, the error is similar.
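To see which data streams account for most of the noise, the error records can be tallied per data stream. Below is a minimal sketch, assuming the records from logs-elastic_agent-* have already been exported as Python dicts with the flat keys shown in the sample log. The function names `data_stream_of` and `count_conflicts` are hypothetical helpers, and the regular expression assumes the usual `.ds-<data-stream>-<yyyy.MM.dd>-<generation>` backing index naming.

```python
import re
from collections import Counter

# Assumption: backing indices follow .ds-<data-stream>-<yyyy.MM.dd>-<generation>,
# where the generation is a zero-padded six-digit number.
_BACKING_INDEX = re.compile(r"^\.ds-(?P<stream>.+)-\d{4}\.\d{2}\.\d{2}-\d{6}$")


def data_stream_of(index_name):
    """Return the data stream name for a backing index, or the input unchanged."""
    match = _BACKING_INDEX.match(index_name)
    return match.group("stream") if match else index_name


def count_conflicts(records):
    """Count 409 version-conflict errors per target data stream."""
    counts = Counter()
    for rec in records:
        if rec.get("error.type") != "version_conflict_engine_exception":
            continue
        if rec.get("http.response.status_code") != 409:
            continue
        counts[data_stream_of(rec.get("index", "<unknown>"))] += 1
    return counts


if __name__ == "__main__":
    sample = {
        "message": "failed to index document",
        "error.type": "version_conflict_engine_exception",
        "http.response.status_code": 409,
        "index": ".ds-metrics-elasticsearch.stack_monitoring.shard-default-2026.04.01-000023",
    }
    for stream, n in count_conflicts([sample]).most_common():
        print(stream, n)
```

Grouping by data stream rather than by backing index keeps the totals stable across rollovers.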
The main operational impact is a very large amount of error noise in logs-elastic_agent-*.
We do not see an Elasticsearch cluster failure at the same time. The cluster is not red and primary shards are available. Metrics appear to be partially or mostly visible in Kibana, but we are not sure whether all metric samples are preserved or whether some samples are rejected as duplicates.
The Elasticsearch Exporter documentation says that version_conflict_engine_exception can be a normal sign of duplicate detection in TSDB metrics data streams. It also states that in some cases data may be classified as a duplicate even when it is not, which implies possible data loss.
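To make the distinction concrete: in a TSDB data stream, a document's identity is derived from its dimensions and @timestamp, so two documents that share both are treated as the same document and the second write is rejected. The sketch below is only a toy model of that rule, not Elasticsearch's actual hashing. The sample shape (`dimensions` and `metrics` keys) and the function name `classify_collisions` are hypothetical. It separates harmless exact duplicates from collisions where the metric values differ, which is the case where a sample would actually be lost.

```python
from collections import defaultdict


def classify_collisions(samples):
    """Group samples by (timestamp, dimensions) and classify collisions.

    Returns two lists of keys: collisions where every sample is identical
    (harmless duplicates) and collisions where metric values differ
    (a rejected write would drop real data).
    """
    groups = defaultdict(list)
    for sample in samples:
        # Toy identity: timestamp plus the sorted dimension key/value pairs.
        key = (sample["@timestamp"], tuple(sorted(sample["dimensions"].items())))
        groups[key].append(sample["metrics"])

    exact_duplicates, differing = [], []
    for key, metric_sets in groups.items():
        if len(metric_sets) < 2:
            continue  # no collision for this identity
        if all(m == metric_sets[0] for m in metric_sets[1:]):
            exact_duplicates.append(key)
        else:
            differing.append(key)
    return exact_duplicates, differing


if __name__ == "__main__":
    ts = "2026-04-01T00:00:00Z"
    samples = [
        {"@timestamp": ts, "dimensions": {"shard": "0"}, "metrics": {"docs": 10}},
        {"@timestamp": ts, "dimensions": {"shard": "0"}, "metrics": {"docs": 10}},
        {"@timestamp": ts, "dimensions": {"shard": "1"}, "metrics": {"docs": 5}},
        {"@timestamp": ts, "dimensions": {"shard": "1"}, "metrics": {"docs": 7}},
    ]
    exact, differing = classify_collisions(samples)
    print("exact duplicates:", len(exact), "differing collisions:", len(differing))
```

If most collisions fall into the second group, that would suggest the dimensions do not uniquely identify each sample, rather than the same sample being sent twice.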
What we have checked
We have verified that:
- The Elasticsearch cluster is running and is not red.
- Primary shards are available.
- Elastic Agent is Healthy.
- The integrations are enabled.
- The issue started after the 9.3.4 upgrade / after re-enabling integrations.
- Most errors are related to TSDB metrics, especially stack_monitoring.shard and SQL metrics.
- The error is emitted by elastic-otel-collector / elasticsearchexporter, not by a custom application log pipeline.
Has anyone encountered this problem? Do you know of a solution?