What is the best practice for backfilling missing data after a transform fails (for example, because a new index was created with the wrong mapping, or for any other reason)?
Example:
A new transform was created at 23:00.
It groups by a date histogram with a 1-minute interval and by terms on server, and computes an avg(load) aggregation.
At 00:00 a new index was created with the wrong mapping: server should be a keyword, but it is a text field.
The transform failed, and until this was fixed at 10:00, data was missing from the destination index.
What is the recommended way of handling such a case? Can it be automated?
Transforms use checkpoints. Because the transform has failed, checkpointing does not proceed; it stays at the point where the failure occurred. Once you restart the transform, it continues from that checkpoint.
To restart the transform, you first need to bring it into the stopped state with:
POST _transform/{id}/_stop?force=true
Afterwards you can start it again with POST _transform/{id}/_start.
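The stop-then-start sequence above could be scripted roughly as follows. This is only a sketch: `restart_requests` and `restart_transform` are hypothetical helper names, the base URL is whatever your cluster listens on, and authentication and TLS settings are left out and would need to be added for a secured cluster.

```python
import urllib.request
from urllib.parse import quote


def restart_requests(base_url: str, transform_id: str) -> list[tuple[str, str]]:
    """Return the (method, URL) pairs for a forced stop followed by a start."""
    root = f"{base_url.rstrip('/')}/_transform/{quote(transform_id, safe='')}"
    return [
        ("POST", f"{root}/_stop?force=true"),  # failed -> stopped
        ("POST", f"{root}/_start"),            # resume from the last checkpoint
    ]


def restart_transform(base_url: str, transform_id: str) -> None:
    """Send both requests in order; urllib raises HTTPError on an error status."""
    for method, url in restart_requests(base_url, transform_id):
        request = urllib.request.Request(url, method=method)
        with urllib.request.urlopen(request, timeout=30) as response:
            response.read()  # drain the body; the status was already checked
```

Run this only after the underlying cause (here, the wrong mapping) has been fixed; otherwise the transform is likely to fail again on the same data.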
This cannot be automated, because the transform treats this error as a permanent problem that requires a user to fix it. This differs from a temporary problem, e.g. a temporary outage of the node that holds the data. If such a failure happens, the transform retries up to 10 times.
This is a good question. As far as I know, the best option right now is to use Watcher with the http input.
I suggest configuring it against the _transform/{transform_id}/_stats endpoint and checking that the transform's state is not failed.
(Note: it is called the http input, but it also supports HTTPS.)
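For illustration, the same check can be run from an external script instead of Watcher. The sketch below is not a Watcher definition. It assumes the stats response carries a top-level `transforms` list whose entries have `id` and `state` fields; `failed_transforms` and `fetch_stats` are hypothetical helper names, and authentication is omitted.

```python
import json
import urllib.request


def failed_transforms(stats: dict) -> list[str]:
    """Return the ids of transforms whose reported state is 'failed'."""
    return [
        t["id"]
        for t in stats.get("transforms", [])
        if t.get("state") == "failed"
    ]


def fetch_stats(base_url: str, transform_id: str) -> dict:
    """GET _transform/{transform_id}/_stats and parse the JSON body."""
    url = f"{base_url.rstrip('/')}/_transform/{transform_id}/_stats"
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)
```

A scheduled job could call `failed_transforms(fetch_stats(base_url, transform_id))` and send an alert whenever the returned list is not empty.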
I will follow up with the team on whether we can provide a better solution in the future, e.g. a transform-wide state. Feel free to open a GitHub issue as an enhancement request if you have a concrete idea of how it should look.