We are using an Elasticsearch latest transform to keep only the latest record per key in our destination index, with ingest.time as the sync time field and extractDatetime as the sort field.
Here, extractDatetime represents the time when the data was extracted, and ingest.time is the time when the data was pushed to Elasticsearch.
- We have a record with the same key, an extractDatetime of Aug 20, 2024 @ 18:30:54.572, and an ingest.time of Aug 21, 2024 @ 10:57:00.468. This record is correctly stored and updated in our index.
- Current Behavior: Records with an extractDatetime that is older than that of an existing record are replacing existing records that have newer extractDatetime values.
- Expected Behavior: Records should only update existing entries if their extractDatetime is greater than the extractDatetime of the existing record.
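To make the expected behavior concrete, here is a minimal Python sketch of the rule we want. It is only a model of the desired outcome, not how Elasticsearch implements latest transforms. The function name apply_latest, the dict-based destination, and the second (older) record with its field values are hypothetical and exist only for illustration; the first record uses the timestamps quoted above.

```python
from datetime import datetime

# Fields that make up the unique key, mirroring "unique_key" in the transform config.
KEY_FIELDS = ("id", "usecase1", "usecase2")


def apply_latest(dest, record, key_fields=KEY_FIELDS):
    """Store `record` in `dest` only if it is new or has a strictly newer extractDatetime.

    Returns True if `dest` was updated, False if the record was ignored.
    """
    key = tuple(record[f] for f in key_fields)
    existing = dest.get(key)
    if existing is None or record["extractDatetime"] > existing["extractDatetime"]:
        dest[key] = record
        return True
    return False


# Record from the post: extracted Aug 20, ingested Aug 21 @ 10:57.
newer = {
    "id": "1", "usecase1": "a", "usecase2": "b",  # hypothetical key values
    "extractDatetime": datetime(2024, 8, 20, 18, 30, 54, 572000),
    "ingest.time": datetime(2024, 8, 21, 10, 57, 0, 468000),
}
# Hypothetical late arrival: extracted earlier, but ingested later.
older = {
    "id": "1", "usecase1": "a", "usecase2": "b",
    "extractDatetime": datetime(2024, 8, 19, 18, 30, 54, 572000),
    "ingest.time": datetime(2024, 8, 21, 11, 5, 0, 0),
}

dest = {}
apply_latest(dest, newer)  # stored: no existing record for this key
apply_latest(dest, older)  # ignored: extractDatetime is older than the stored one
```

With this rule, the late-arriving record with the older extractDatetime would leave the destination unchanged, whereas in our index it currently overwrites the newer one.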
Please refer to the transform config below:
{
  "source": {
    "index": [
      "sample-v3"
    ],
    "query": {
      "match_all": {}
    }
  },
  "dest": {
    "index": "sample-v2"
  },
  "frequency": "2m",
  "sync": {
    "time": {
      "field": "ingest.time",
      "delay": "120s"
    }
  },
  "latest": {
    "unique_key": [
      "id",
      "usecase1",
      "usecase2"
    ],
    "sort": "extractDatetime"
  },
  "settings": {},
  "retention_policy": {
    "time": {
      "field": "extractDatetime",
      "max_age": "30d"
    }
  }
}
Is there a known issue with the latest transform functionality that could cause records with older extractDatetime values to replace those with newer values?
Could the sync.time configuration or delay settings be influencing this behavior?
What additional checks or configurations are recommended to ensure correct record updates?