Hi,
I have a strange issue that came to light after we started using data streams (and thus `create` actions instead of updates).
We have the following fingerprint config in Logstash:
### Add a fingerprint to prevent duplicate log events.
fingerprint {
  concatenate_sources => true
  source => ["message","agent.hostname"]
  target => "[@metadata][fingerprint]"
  method => "SHA1"
  key => "deduplication-key"
}
This event already exists in Elasticsearch:
{
"_index": ".ds-agl-api-ds-2021.08.25-001447",
"_type": "_doc",
"_id": "c3b600da568b237e44e68f1a5bd718246e58a908",
"_score": 1,
"_source": {
"input": {
"type": "log"
},
"ecs": {
"version": "1.6.0"
},
"log-message": "Configuration cache updated!",
"tags": [
"avs6",
"api-log",
"apigateway",
"asd",
"beats_input_codec_plain_applied"
],
"log-level": "INFO",
"message": "ts: 2021-08-25 10:36:40.769 | logLevel: INFO | appId: AGL | thread: | SID: undefined | TN: undefined | clientIp: undefined | userId: ANONYMOUS | apiType: NANO | api: | platform: | eventType: NONE | message: Configuration cache updated!",
"log": {
"file": {
"path": "/product/AGL/agl-core/logs/agl.log"
},
"offset": 3737176171
},
"api-type": "NANO",
"@version": "1",
"fields": {
"environment": "production"
},
"@timestamp": "2021-08-25T08:36:40.769Z",
"app-id": "AGL",
"user-id": "ANONYMOUS",
"agent": {
"name": "papps1443.prdl.itv.local",
"ephemeral_id": "1daa9993-bf4e-4ce0-bc00-bb3762c88820",
"version": "7.10.2",
"hostname": "papps1443.prdl.itv.local",
"id": "139e78fb-a5d6-47f9-813d-7d63e08b5d32",
"type": "filebeat"
},
"event-type": "NONE"
},
"fields": {
"log-message": [
"Configuration cache updated!"
],
"app-id": [
"AGL"
],
"api-type": [
"NANO"
],
"event-type": [
"NONE"
],
"user-id": [
"ANONYMOUS"
],
"input.type": [
"log"
],
"log.offset": [
3737176171
],
"fields.environment": [
"production"
],
"agent.hostname": [
"papps1443.prdl.itv.local"
],
"message": [
"ts: 2021-08-25 10:36:40.769 | logLevel: INFO | appId: AGL | thread: | SID: undefined | TN: undefined | clientIp: undefined | userId: ANONYMOUS | apiType: NANO | api: | platform: | eventType: NONE | message: Configuration cache updated!"
],
"tags": [
"avs6",
"api-log",
"apigateway",
"asd",
"beats_input_codec_plain_applied"
],
"agent.type": [
"filebeat"
],
"@timestamp": [
"2021-08-25T08:36:40.769Z"
],
"agent.id": [
"139e78fb-a5d6-47f9-813d-7d63e08b5d32"
],
"ecs.version": [
"1.6.0"
],
"log-level": [
"INFO"
],
"log.file.path": [
"/product/AGL/agl-core/logs/agl.log"
],
"@version": [
"1"
],
"agent.ephemeral_id": [
"1daa9993-bf4e-4ce0-bc00-bb3762c88820"
],
"agent.name": [
"papps1443.prdl.itv.local"
],
"agent.version": [
"7.10.2"
]
}
}
My Logstash log contains this error:
[2021-08-25T10:36:42,590][WARN ][logstash.outputs.elasticsearch]
Failed action {
:status=>409,
:action=>[
"create",
{
:_id=>"c3b600da568b237e44e68f1a5bd718246e58a908",
:_index=>"agl-api-ds",
:routing=>nil
},
{
"input"=>{
"type"=>"log"
},
"ecs"=>{
"version"=>"1.6.0"
},
"log-message"=>"Configuration cache updated!",
"tags"=>[
"avs6",
"api-log",
"apigateway",
"asd",
"beats_input_codec_plain_applied"
],
"log-level"=>"INFO",
"message"=>"ts: 2021-08-25 10:36:40.769 | logLevel: INFO | appId: AGL | thread: | SID: undefined | TN: undefined | clientIp: undefined | userId: ANONYMOUS | apiType: NANO | api: | platform: | eventType: NONE | message: Configuration cache updated!",
"log"=>{
"file"=>{
"path"=>"/product/AGL/agl-core/logs/agl.log"
},
"offset"=>2101471568
},
"api-type"=>"NANO",
"@version"=>"1",
"fields"=>{
"environment"=>"production"
},
"@timestamp"=>2021-08-25T08:36:40.769Z,
"app-id"=>"AGL",
"user-id"=>"ANONYMOUS",
"agent"=>{
"name"=>"papps1632.prdl.itv.local",
"ephemeral_id"=>"740640a2-daed-4807-9ff6-55bfdabeb066",
"version"=>"7.10.2",
"hostname"=>"papps1632.prdl.itv.local",
"id"=>"b0d97044-dc4d-4778-8a63-a64368a9b26c",
"type"=>"filebeat"
},
"event-type"=>"NONE"
}
],
:response=>{
"create"=>{
"_index"=>".ds-agl-api-ds-2021.08.25-001447",
"_type"=>"_doc",
"_id"=>"c3b600da568b237e44e68f1a5bd718246e58a908",
"status"=>409,
"error"=>{
"type"=>"version_conflict_engine_exception",
"reason"=>"[c3b600da568b237e44e68f1a5bd718246e58a908]: version conflict, document already exists (current version [1])",
"index_uuid"=>"1U0Gff1CQVab4aaDbTYqLQ",
"shard"=>"0",
"index"=>".ds-agl-api-ds-2021.08.25-001447"
}
}
}
}
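Note the agent hostname in both cases: the indexed document came from papps1443.prdl.itv.local, while the rejected event came from papps1632.prdl.itv.local. Since the hostname is one of the fingerprint sources, how is it possible that both events get the same ID?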
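To spell out what I expected, here is a minimal Python sketch of a keyed SHA1 over the two concatenated sources. The helper `fingerprint` and the way it joins field names and values are my own assumptions for illustration, not the plugin's actual serialization, and the message is shortened to a stand-in. The point is only that a different hostname should give a different digest:

```python
import hashlib
import hmac

# Same key as in the fingerprint filter config above.
KEY = b"deduplication-key"


def fingerprint(message: str, hostname: str) -> str:
    """Hypothetical stand-in for the filter: HMAC-SHA1 over both sources.

    The "|field|value|" layout is an assumption made for this sketch;
    the real plugin may serialize the sources differently.
    """
    payload = f"|agent.hostname|{hostname}|message|{message}|"
    return hmac.new(KEY, payload.encode("utf-8"), hashlib.sha1).hexdigest()


# Shortened stand-in for the full log line; identical on both servers.
message = "Configuration cache updated!"

id_a = fingerprint(message, "papps1443.prdl.itv.local")
id_b = fingerprint(message, "papps1632.prdl.itv.local")

# Same message, different hostname: the digests should not match.
print(id_a != id_b)
```

With both sources included in the hash, I would expect two distinct `_id` values here, so no 409 conflict.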