We are using Elasticsearch latest transforms to fetch the latest records in our index, with ingest.time as the sync time field and extractDatetime as the sort field.
The transform involves the fields extractDatetime and ingest.time, where extractDatetime represents the time when the data was extracted and ingest.time is the time when the data was pushed to Elasticsearch.
- We have a record with the same key, an extractDatetime of Aug 20, 2024 @ 18:30:54.572, and an ingest.time of Aug 21, 2024 @ 10:57:00.468. This record is correctly stored and updated in our index.
- Current Behavior: Records with an extractDatetime that is older than that of an existing record are replacing existing records that have newer extractDatetime values.
- Expected Behavior: Records should only update existing entries if their extractDatetime is greater than the extractDatetime of the existing record.
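To make the expected behavior concrete, here is a minimal Python sketch of the update rule we have in mind. It is only an illustration of the semantics, not how the transform is implemented. The key values (a1, u1, u2) and the second, older record are hypothetical; only the first record's timestamps come from our data.

```python
from datetime import datetime

# unique_key fields taken from the transform config
KEY_FIELDS = ("id", "usecase1", "usecase2")

def apply_if_newer(dest, record):
    """Store record in dest only if its extractDatetime is strictly newer
    than the one already stored for the same key. Returns True if stored."""
    key = tuple(record[field] for field in KEY_FIELDS)
    existing = dest.get(key)
    if existing is None or record["extractDatetime"] > existing["extractDatetime"]:
        dest[key] = record
        return True
    return False

dest = {}

# Record from the description above (key values are hypothetical).
newer = {
    "id": "a1", "usecase1": "u1", "usecase2": "u2",
    "extractDatetime": datetime(2024, 8, 20, 18, 30, 54, 572000),
    "ingest.time": datetime(2024, 8, 21, 10, 57, 0, 468000),
}

# Hypothetical record: older extractDatetime, but ingested later.
older = {
    "id": "a1", "usecase1": "u1", "usecase2": "u2",
    "extractDatetime": datetime(2024, 8, 19, 18, 30, 54),
    "ingest.time": datetime(2024, 8, 21, 11, 5, 0),
}

first_applied = apply_if_newer(dest, newer)   # stored: no existing entry
second_applied = apply_if_newer(dest, older)  # rejected: extractDatetime is older
```

What we observe in the destination index instead corresponds to the second record overwriting the first.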
Please refer to the transform config below:
{
  "source": {
    "index": [
      "sample-v3"
    ],
    "query": {
      "match_all": {}
    }
  },
  "dest": {
    "index": "sample-v2"
  },
  "frequency": "2m",
  "sync": {
    "time": {
      "field": "ingest.time",
      "delay": "120s"
    }
  },
  "latest": {
    "unique_key": [
      "id",
      "usecase1",
      "usecase2"
    ],
    "sort": "extractDatetime"
  },
  "settings": {},
  "retention_policy": {
    "time": {
      "field": "extractDatetime",
      "max_age": "30d"
    }
  }
}
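One check that may help narrow this down is to ask the source index directly which document has the greatest extractDatetime for a given key, and compare it with what ends up in sample-v2. The sketch below builds such a search body in Python. The helper name and the key values are hypothetical, and it assumes the three key fields are mapped so that a term query matches them exactly (for example as keyword).

```python
import json

def build_latest_check_query(key_values, sort_field="extractDatetime"):
    """Build a search body that returns the single document with the
    greatest sort_field value among documents matching all key fields."""
    filters = [{"term": {field: value}} for field, value in key_values.items()]
    return {
        "size": 1,
        "query": {"bool": {"filter": filters}},
        "sort": [{sort_field: {"order": "desc"}}],
    }

# Hypothetical key values; replace with a key that shows the problem.
body = build_latest_check_query({"id": "a1", "usecase1": "u1", "usecase2": "u2"})
print(json.dumps(body, indent=2))
```

The printed body can be sent to the _search endpoint of sample-v3 and then of sample-v2. If the two results differ for the same key, the destination does not hold the document with the latest extractDatetime.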
- Is there a known issue with the latest transform functionality that could cause records with older extractDatetime values to replace those with newer values?
- Could the sync.time configuration or delay settings be influencing this behavior?
- What additional checks or configurations are recommended to ensure correct record updates?