Are there any reasons to pick one over another to store historical data?
As far as I can see, rollups have limited ability for searching and aggregations (no filtered aggregations, etc.), while a transformed index doesn't have any of these limitations. It looks like transformed indices are more flexible, but are there any drawbacks? A performance hit, maybe?
Rollup implements the compaction use case: you want to save storage and still be able to access historical data.
Transform's use case is building entity-centric indices, feature creation for machine learning, and data analysis.
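To make the difference concrete, here is a rough sketch of the two request bodies side by side, written as Python dicts. The index names (`sensor-*`, `sensor_rollup`, `sensor_by_node`) and field names (`timestamp`, `node`, `temperature`) are made up for illustration, and the exact parameter names can differ between Elasticsearch versions (for example, older releases use `interval` instead of `fixed_interval`), so check the docs for your version before using this.

```python
import json


def rollup_job_body():
    """Sketch of a body for PUT _rollup/job/<job_id>.

    All index and field names are hypothetical.
    """
    return {
        "index_pattern": "sensor-*",      # source indices to compact
        "rollup_index": "sensor_rollup",  # where the compacted docs go
        "cron": "0 0 * * * ?",            # run at the top of every hour
        "page_size": 1000,
        "groups": {
            # Rollup always buckets by time: this is the compaction step.
            "date_histogram": {"field": "timestamp", "fixed_interval": "1h"},
            "terms": {"fields": ["node"]},
        },
        # Only the listed metrics are kept for the listed fields.
        "metrics": [{"field": "temperature", "metrics": ["min", "max", "avg"]}],
    }


def transform_body():
    """Sketch of a body for PUT _transform/<transform_id>.

    Builds one document per entity (here: per node). Names are hypothetical.
    """
    return {
        "source": {"index": "sensor-*"},
        "dest": {"index": "sensor_by_node"},
        "pivot": {
            "group_by": {"node": {"terms": {"field": "node"}}},
            "aggregations": {
                "avg_temperature": {"avg": {"field": "temperature"}},
                "last_seen": {"max": {"field": "timestamp"}},
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(rollup_job_body(), indent=2))
    print(json.dumps(transform_body(), indent=2))
```

The rollup body is organized around a time histogram, while the transform body is organized around an entity key, which mirrors the two use cases described above.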
Both actually share the same foundation, but you are right that transform supports more aggregations and grouping, e.g. on terms (IDs). Rollup, however, provides rollup search to seamlessly search across compacted and non-compacted data.
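As a minimal sketch of what rollup search looks like, the helper below only builds the request path and body for the `_rollup_search` endpoint; it does not send anything. The helper name, the index names, and the `temperature` field are hypothetical. As far as I know, the request may name several live indices but only one rollup index, and `size` has to be 0 or omitted because only aggregations are returned.

```python
def rollup_search_request(live_index, rollup_index):
    """Build path and body for a rollup search over live + rolled-up data.

    Hypothetical helper: index and field names are placeholders.
    """
    # Live and rollup indices go into the same comma-separated target.
    path = f"/{live_index},{rollup_index}/_rollup_search"
    body = {
        "size": 0,  # rollup search returns aggregations, not hits
        "aggregations": {
            "max_temperature": {"max": {"field": "temperature"}},
        },
    }
    return path, body


if __name__ == "__main__":
    path, body = rollup_search_request("sensor-1", "sensor_rollup")
    print("GET", path)
    print(body)
```

The body uses the regular aggregation syntax, so the caller does not need to know which part of the time range has already been compacted.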
Your use case sounds like a rollup use case, because you mention historical data. Note that rollup is experimental; this label says nothing about performance or stability, it applies to the APIs. Long term, we plan to integrate rollup into ILM, which will also solve the conceptual problem of transform vs. rollup.
Performance-wise, both rollup and transform are as fast as aggregations and indexing can go. There is no huge difference between the two.