I read yesterday's blog post about the future of a "stateless" Elasticsearch architecture, where one of the big benefits is the ability to scale indexing workloads separately from search workloads...
I have a 30+ node Elastic cluster that ingests (and deletes) 6 billion records every day, and it’s been a constant struggle to keep the indexing workload from interfering with search performance. So the new architecture sounds like EXACTLY what we need.
I read it the same way: it looks like it will be an Elastic Cloud-only feature.
Since it will rely on a search model similar to searchable snapshots, if it lands in the self-managed version it will probably be on the Enterprise license.
Or maybe this could be seen as a hint that Elastic will focus on the cloud offering and the self-managed version will lack some features in the future.
I think this idea is kind of interesting. From a high-level architecture overview, it seems very similar to Thanos, which has shown the feasibility of this type of architecture (even as a user-deployable system).
One thing I'm curious about is the intended use cases for this new architecture. The blog post doesn't really state the exact use cases this setup is intended for, but it seems to imply that it's mainly for logs/metrics/append-only style use cases.
I'd also be interested in seeing search-side benchmarks. While the indexing improvements are pretty significant, it would be interesting to know if there are any trade-offs on the search side.
I'm curious about where this will leave on-prem customers. It's currently less expensive for my organization to run Elastic on servers we maintain. This could potentially change costs enough to make a difference for us.
From my point of view, high availability and data replication / horizontal scaling are must-haves no matter what the architecture is.
I am curious to see how this would be addressed based on the simple architecture diagram here: Stateless — your new state of find with Elasticsearch | Elastic Blog
If all these criteria are met and the response time / speed is improved, this would be perfect.