Latest unique transform missing documents

Phil_McLachlan · October 11, 2024, 7:02pm

We have what I believe to be a straightforward transform to pick out unique documents ordered by an ingest pipeline timestamp. We apply it to the index data of several programs (customers), but with larger indices it is missing documents. With 5,000 or fewer documents in an index it seems to work fine, but with 23,000 documents it seems to be missing about 500 to 1,000 documents.

Furthermore, I have noticed that when the timestamps are close, it is not picking the latest one. It picks the earliest one.

Here is the transform:

PUT _transform/phils_test_unique
{
  "source": {
    "index": "axp_marketplace_search_catalog_1_1728534453"
  },
  "dest": {
    "index": "phils_test_unique",
    "pipeline": "axp_marketplace_event_ingested_ingest_pipeline"
  },
  "latest": {
    "unique_key": ["product_pk", "catalog_type"],
    "sort": "event.ingested"
  },
  "description": "",
  "frequency": "5m",
  "sync": {
    "time": {
      "field": "event.ingested",
      "delay": "60s"
    }
  }
}

Can anyone tell me why the resulting index is missing documents? It seems to be missing the same number each time it is run.

Phil_McLachlan · October 11, 2024, 9:18pm

Ok, I figured it out. I had to change the mapping of catalog_type in the source index to type "keyword", and then remove the keyword suffix from the unique_key in the transform. Now all the documents are in the output index.
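
For anyone landing here later, the mapping change might look roughly like the sketch below. This is only an illustration: the index name my_source_index is hypothetical, and only catalog_type is shown because that is the only field whose type is mentioned in this thread. Note that the type of an existing field cannot be changed in place, so this would normally go into a new index (or an index template) followed by a reindex.

```console
# Hypothetical index name; only the catalog_type mapping comes from this thread.
PUT my_source_index
{
  "mappings": {
    "properties": {
      "catalog_type": { "type": "keyword" }
    }
  }
}
```

With the field mapped this way, the transform's unique_key refers to the field name directly, as in "unique_key": ["product_pk", "catalog_type"].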

system · November 8, 2024, 9:18pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
