I have been searching the web and the Elastic forum for a solution, but it doesn't seem like other people are having this problem with the Transform function.
I'm using Transform to create summary tables to optimize queries. It works when I create a new transform. However, every day a new index matching the same index pattern is added to Elasticsearch. I know that a transform can run continuously by checking the source indices for changes. How should I configure it to make this work?
For example:
These are the indices; a new index is added each day.
snort-2020-06-01
snort-2020-06-02
snort-2020-06-03
....
snort-2020-06-30 (newly added)
When creating the transform, I chose the index pattern "snort*".
I want the continuous transform to pick up new indices that match the pattern and update the transformed index.
Your approach looks fine to me. You can use index patterns with wildcards as the source of a transform; an alias that points to multiple indices should work as well.
If you are using date_histogram in your group_by, it's advisable to use at least 7.7, which introduced an optimization for this case.
If you need specific help, please post your job configuration, or at least the parts you have questions about.
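To make the wildcard setup concrete, here is a rough sketch of what a continuous transform body could look like, built as a Python dict so it can be printed as JSON and pasted into a PUT _transform/transformed_snort request. Only "snort*", transformed_snort, @timestamp and the two @timestamp aggregations come from this thread; the group_by field src_ip and the frequency of 1m are assumptions made up for illustration.

```python
import json


def build_continuous_transform(source_pattern, dest_index, sync_field, delay="60s"):
    """Return a request body for a continuous pivot transform.

    The presence of the "sync" section is what makes the transform continuous.
    """
    return {
        # A wildcard pattern: indices created later that match it are included.
        "source": {"index": [source_pattern]},
        "dest": {"index": dest_index},
        # How often the transform checks the source for changes (assumed value).
        "frequency": "1m",
        "sync": {"time": {"field": sync_field, "delay": delay}},
        "pivot": {
            # Hypothetical grouping field; replace with your own.
            "group_by": {"src_ip": {"terms": {"field": "src_ip"}}},
            "aggregations": {
                "@timestamp.max": {"max": {"field": "@timestamp"}},
                "@timestamp.min": {"min": {"field": "@timestamp"}},
            },
        },
    }


body = build_continuous_transform("snort*", "transformed_snort", "@timestamp")
print(json.dumps(body, indent=2))
```

The sync field has to exist in the source indices; it does not need to appear in the destination index.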
Thanks for your reply, but my new indices are not being picked up by the transformed_snort job. Currently I'm using just one node for Elasticsearch, but I suppose that will not affect the transform job?
In part 2 of the configuration, I suppose the date field '@timestamp' refers to the timestamp in the index pattern (snort*)?
Per my requirements, transformed_snort does not have any @timestamp field, only the @timestamp.max and @timestamp.min aggregations.
When using the Kibana UI, you are using Kibana index patterns. I am not sure why it's not picked up. Can you check the source in the dev console:
GET _transform/{name}
Maybe the Kibana index pattern is not set up correctly, but it looks OK to me. How do you know it's not picking up new indices? Do visualizations work using the same pattern?
For a continuous transform, you specify the timestamp field in the source index; the suggested @timestamp looks OK to me.
Every day a new index is created. I found that the "doc count" and "size" of transformed_snort did not increase, and suspected something went wrong with the continuous transform.
Visualization of the index pattern (snort*) is working well for me.
The stats do not contain any errors (see the counters ending in _failures). The checkpoint is only 2, which means it created only 1 more, but the trigger count is 230, so it checked for updates more than 200 times.
The time_upper_bound_millis corresponds to 07/01/2020 @ 1:48pm (UTC). Are the timestamps of the data you are adding earlier than that?
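For anyone reading the stats the same way, here is a small sketch that pulls out the numbers discussed above from one entry of a GET _transform/{name}/_stats response. The sample entry is hand-built from the values mentioned in this thread (checkpoint 2, trigger count 230, no failures, upper bound of 2020-07-01 13:48 UTC to the minute); it is not the actual response, and the key layout is my assumption of how the stats are shaped.

```python
from datetime import datetime, timezone


def summarize_transform_stats(entry):
    """Summarize one transform entry from a stats response (assumed layout)."""
    stats = entry["stats"]
    last = entry["checkpointing"]["last"]
    # Keep only the failure counters that are non-zero.
    failures = {k: v for k, v in stats.items() if k.endswith("_failures") and v}
    upper = datetime.fromtimestamp(
        last["time_upper_bound_millis"] / 1000, tz=timezone.utc
    )
    return {
        "checkpoint": last["checkpoint"],
        "trigger_count": stats["trigger_count"],
        "failures": failures,
        "upper_bound_utc": upper.isoformat(),
    }


# Illustrative sample only, reconstructed from the figures quoted in the thread.
sample_upper = datetime(2020, 7, 1, 13, 48, tzinfo=timezone.utc)
sample_entry = {
    "stats": {"trigger_count": 230, "index_failures": 0, "search_failures": 0},
    "checkpointing": {
        "last": {
            "checkpoint": 2,
            "time_upper_bound_millis": int(sample_upper.timestamp() * 1000),
        }
    },
}

summary = summarize_transform_stats(sample_entry)
print(summary)
```

A high trigger count with a checkpoint that stays low and no failures suggests the transform is running but not seeing anything it considers new.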
The concept of a continuous transform is to continually increment and process checkpoints as new source data is ingested. The timestamp used for synchronizing source and dest must follow real time, meaning it must be a recent timestamp. To adjust for indexing delays, e.g. because the timestamp you use runs behind due to processing delays, you can use the delay parameter (default 60s). That causes the transform to deduct the delay when querying data, e.g. lt now-delay.
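A simplified way to picture this, as a sketch rather than the actual implementation: a document is considered for the next checkpoint only if its sync timestamp falls between the previous checkpoint's upper bound and now minus the delay. The function name and the example times below are made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def in_next_checkpoint(doc_time, last_upper_bound, now, delay=timedelta(seconds=60)):
    """Simplified model: True if last_upper_bound <= doc_time < now - delay."""
    return last_upper_bound <= doc_time < now - delay


# Hypothetical times for illustration.
last_upper = datetime(2020, 7, 1, 13, 48, tzinfo=timezone.utc)
now = datetime(2020, 7, 2, 9, 0, tzinfo=timezone.utc)

historic = datetime(2020, 6, 30, 12, 0, tzinfo=timezone.utc)  # old event time
recent = now - timedelta(minutes=5)                           # follows real time
too_fresh = now - timedelta(seconds=10)                       # inside the delay

print(in_next_checkpoint(historic, last_upper, now))   # False
print(in_next_checkpoint(recent, last_upper, now))     # True
print(in_next_checkpoint(too_fresh, last_upper, now))  # False
```

In this model the too-fresh document is not lost; it falls into a later checkpoint once now minus the delay has moved past it. The historic document, however, stays behind the upper bound no matter how long you wait.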
If you process historic data, there is no need for a continuous transform; you can use a batch transform instead. Is there a reason you want to process historic data but still use continuous mode?
There is a trick: instead of using the historic timestamp, you can add an ingest timestamp while you are feeding in new data, here is how. You then use the ingest timestamp for sync; you can keep your timestamp_max and timestamp_min as is.
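As a minimal sketch of that trick: an ingest pipeline with a set processor can stamp every incoming document with the ingest time, and the transform's sync section then points at that field. The field name document_created_datetime and the pipeline id add_ingest_timestamp are placeholders, not anything Elasticsearch requires.

```python
import json


def build_ingest_timestamp_pipeline(field):
    """Body for PUT _ingest/pipeline/<id>: write the ingest time into `field`."""
    return {
        "description": "Stamp each document with the time it was ingested",
        "processors": [
            {"set": {"field": field, "value": "{{_ingest.timestamp}}"}}
        ],
    }


def build_sync(field, delay="60s"):
    """The "sync" section of the transform, keyed on the ingest timestamp."""
    return {"time": {"field": field, "delay": delay}}


ingest_field = "document_created_datetime"  # placeholder field name
pipeline_body = build_ingest_timestamp_pipeline(ingest_field)
sync_section = build_sync(ingest_field)

print("PUT _ingest/pipeline/add_ingest_timestamp")  # placeholder pipeline id
print(json.dumps(pipeline_body, indent=2))
print(json.dumps({"sync": sync_section}, indent=2))
```

The pipeline still has to be applied when indexing, for example with the pipeline request parameter or, I believe, by setting index.default_pipeline in the template that creates the daily indices. The pivot aggregations on @timestamp stay unchanged.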
Thank you for clarifying. I am testing out a solution that may be deployed in production, so I wanted to mimic it as closely as possible, but I do not have access to real-time data.
But I think the "trick" of creating a document_created_datetime field will work for both dev and prod.