Struggling to upsert only one field of a document

Hello,

I'm using Elasticsearch to store billions of data points, each with four key fields:

  • value
  • type
  • date_first_seen
  • date_last_seen

I use Logstash's fingerprint filter to compute a murmur3 (mmh3) ID for each document from its type and value. During processing I may encounter the same type and value combination multiple times, and in those cases I only want to update the date_last_seen field.

My goal is for both date_first_seen and date_last_seen to be set to @timestamp when a document is first created, and for subsequent updates to change only date_last_seen (illustrated below). This is what I'm struggling to implement correctly.
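To illustrate (the field values here are invented), after the first sighting of a given type/value pair the document should look like this:

{
  "value": "example.com",
  "type": "domain",
  "date_first_seen": "2024-05-01T10:00:00.000Z",
  "date_last_seen": "2024-05-01T10:00:00.000Z"
}

After a later sighting of the same pair, only date_last_seen should have moved forward:

{
  "value": "example.com",
  "type": "domain",
  "date_first_seen": "2024-05-01T10:00:00.000Z",
  "date_last_seen": "2024-05-03T08:15:00.000Z"
}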

Here's what I currently have in my Logstash configuration:

input {
  rabbitmq {
    ....
  }
}

filter {
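  # drop unused fields and build the string the fingerprint is computed from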
  mutate { 
    remove_field => [ "@version", "event", "date" ] 
    add_field => { "[@metadata][m3_concat]" => "%{type}%{value}" } 
  }

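  # 128-bit murmur3 hash of type + value; used as the document ID in the output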
  fingerprint {
    method => "MURMUR3_128"
    source => "[@metadata][m3_concat]"
    target => "[@metadata][custom_id_128]"
  }

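  # keep the event timestamp as date_last_seen, then drop @timestamp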
  mutate {
    add_field => { "date_last_seen" => "%{@timestamp}" }
  }

  mutate { remove_field => ["@timestamp"] }
}

output {
  elasticsearch {
    hosts => ["http://es-master-01:9200"]
    ilm_rollover_alias => "data"
    ilm_pattern => "000001"
    ilm_policy => "ilm-data"
    document_id => "%{[@metadata][custom_id_128]}"
    action => "update"
    doc_as_upsert => true
    # if no document with this ID exists yet, index this one instead
    upsert => '{"date_first_seen": "%{date_last_seen}", "type": "%{type}", "value": "%{value}", "date_last_seen": "%{date_last_seen}"}'
  }
}
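If I'm reading the elasticsearch output and the _update API docs correctly (this is my assumption, I haven't inspected the actual bulk requests), doc_as_upsert => true turns each event into a bulk action along these lines:

{ "update": { "_index": "data", "_id": "<murmur3 id>" } }
{ "doc": { "type": "...", "value": "...", "date_last_seen": "..." }, "doc_as_upsert": true }

so the event itself becomes the upsert document, my separate upsert option is ignored (or conflicts with it), and date_first_seen is never written at all.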

Whatever the exact cause, this configuration isn't working as intended. I have also tried scripted updates, sketched below, but given that this Logstash instance processes around 8,000 documents per second, I'm unsure whether running a script per document is the most efficient approach.
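For reference, this is roughly the scripted variant I tried. I'm reconstructing it from memory and from my reading of the plugin docs (the script, script_type and script_lang options, and the event being exposed to the script as params.event), so the exact syntax may be off:

output {
  elasticsearch {
    hosts => ["http://es-master-01:9200"]
    ilm_rollover_alias => "data"
    ilm_pattern => "000001"
    ilm_policy => "ilm-data"
    document_id => "%{[@metadata][custom_id_128]}"
    action => "update"
    # when the document already exists, only move date_last_seen forward
    script_lang => "painless"
    script_type => "inline"
    script => "ctx._source.date_last_seen = params.event.get('date_last_seen')"
    # when it does not exist yet, index the full document instead
    upsert => '{"date_first_seen": "%{date_last_seen}", "type": "%{type}", "value": "%{value}", "date_last_seen": "%{date_last_seen}"}'
  }
}

My understanding is that the script only runs when the document already exists, and that the upsert document is indexed as-is otherwise, which is exactly the behaviour I want, but I don't know what the Painless execution overhead looks like at this volume.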

Could someone provide guidance on how to properly configure this to update only the date_last_seen field on subsequent encounters of the same type and value, while keeping date_first_seen unchanged?

Any help would be greatly appreciated!

Thanks!