I know we can have an alias on both a rolled-up and a raw index (which logically does not make sense), but I created one anyway.
So I have a dashboard sourcing from an alias comprising raw and rolled-up indices, but I have chosen the rolled-up fields (as I recall, that was the only option).
So the question is: why allow an alias on raw and rolled-up indices when Kibana can source from only one? Am I missing something here?
My use case: data is ingested at millisecond level, and I have a rollup job that runs every 15 min and rolls the data up to minute level. I don't think it is possible to show millisecond-level raw data and minute-level rolled-up data at the same time (logically it does not make sense), but out of curiosity I wanted to experiment. Before I put my experiment in the failure bucket, I want to get other thoughts.
@hsalim Thanks for your question! The fields are restricted to those from the rolled-up index so that every field you can query is available in both indices.
If we flipped it around and the fields from the raw index were available, then theoretically you'd be able to search on fields that are missing from the rolled-up index.
More generally, I'm not sure why you'd want an alias pointing to both the raw and rolled-up data, because that defeats the purpose of creating a rollup:
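To make the field restriction concrete, here is a minimal sketch in Python. The field names are hypothetical, and this is only a simplified illustration of the reasoning, not how Kibana actually builds its field list: only fields present in both indices are safe to query through the alias.

```python
# Hypothetical field names, for illustration only.
raw_fields = {"@timestamp", "amount", "customer_id", "message"}
rollup_fields = {"@timestamp", "amount"}  # only what the rollup job kept

# Fields that exist in both indices can be queried through the alias.
safe_fields = raw_fields & rollup_fields

# Fields that exist only in the raw index would hit "missing data"
# whenever the query touches the rolled-up index.
missing_from_rollup = raw_fields - rollup_fields

print(sorted(safe_fields))
print(sorted(missing_from_rollup))
```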
You can now use the rolled up data for analysis at a fraction of the storage cost of the original index.
Thanks, very helpful! Yes, I agree with you that there is no need to have an alias pointing to both raw and rolled-up data. We also expect very significant storage savings, still to be quantified.
Let me expand on our use case. We have millions of transactions arriving every minute, and rollups happen every 15 min, but we want our dashboards to show near-real-time data at minute level. One obvious option is to reduce the rollup schedule from 15 min to 1 min, implying (in theory) that a rollup will always be running. I need to test this to see what impact (if any) it might have on our system. Do you see any issues with this approach? Any other options? Maybe Elastic data streams are an option?
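For anyone comparing the two schedules, here is a rough sketch of what the rollup job body might look like for each. It only builds the JSON you would send to `PUT _rollup/job/<job_id>`; the index names, the `amount` field, the `page_size`, and the `delay` value are all assumptions made up for illustration, and `fixed_interval` assumes a recent 7.x version.

```python
import json


def build_rollup_job(cron: str, delay: str = "1m") -> dict:
    """Build a rollup job body; every name below is a hypothetical placeholder."""
    return {
        "index_pattern": "transactions-*",      # assumed raw index pattern
        "rollup_index": "transactions_rollup",  # assumed rollup target
        "cron": cron,                           # how often the job wakes up
        "page_size": 1000,
        "groups": {
            "date_histogram": {
                "field": "@timestamp",
                "fixed_interval": "1m",         # roll up to minute level
                "delay": delay,                 # wait for late-arriving docs
            }
        },
        "metrics": [
            {"field": "amount", "metrics": ["sum", "avg", "max"]}
        ],
    }


# Current setup: run every 15 minutes.
every_15_min = build_rollup_job("0 0/15 * * * ?")

# Option under test: run every minute.
every_minute = build_rollup_job("0 * * * * ?")

print(json.dumps(every_minute, indent=2))
```

Only the `cron` value differs between the two bodies; the minute-level bucket size stays the same, so the thing to measure is how the more frequent runs affect the cluster.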
@hsalim You can definitely try using data streams.
The other option is to experiment in a dev setup with a completely different cluster. That way, you can slice and dice the data without risking adverse effects in Prod.