My question for any Elastic team members reading (who are deeply familiar with Elasticsearch architecture/internals) is whether Elastic (core developers/company) thinks of Elasticsearch as implementing SEDA (staged event-driven architecture). If so, how much does it differ from the core tenets of SEDA? For example, are the thread pools exposed via /_cat/thread_pool the same thing as the per-stage internal queues defined in the SEDA architecture, yes or no? Are there other changes or liberties taken with SEDA due to real-world experience? Or does Elasticsearch implement some other model entirely?
Tricky question to answer. We do not, for instance, refer to the SEDA literature on a daily basis, but there are certainly some similarities. The way Elasticsearch uses thread pools is indeed one such similarity, although SEDA's thread pools seem to be per-stage and Elasticsearch's are much coarser. It's hard to say whether these similarities are directly because of this research or whether it's a case of convergent evolution.
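For readers who haven't seen SEDA before, here is a rough sketch of the "per-stage queue plus thread pool" idea mentioned above. It is a toy illustration in Python, not Elasticsearch code: the `Stage` class, the stage names `parse` and `execute`, and the choice to reject events when a queue is full are all assumptions made for the example.

```python
import queue
import threading


class Stage:
    """One SEDA-style stage: a bounded event queue drained by its own worker threads."""

    _STOP = object()  # sentinel that tells a worker to exit

    def __init__(self, name, handler, workers=2, capacity=8, downstream=None):
        self.name = name
        self.handler = handler          # function applied to each event
        self.downstream = downstream    # next stage, or None for the last stage
        self.events = queue.Queue(maxsize=capacity)
        self.threads = [
            threading.Thread(target=self._run, daemon=True) for _ in range(workers)
        ]
        for t in self.threads:
            t.start()

    def submit(self, event):
        """Enqueue without blocking; return False (reject) if the queue is full."""
        try:
            self.events.put_nowait(event)
            return True
        except queue.Full:
            return False

    def _run(self):
        while True:
            event = self.events.get()
            if event is Stage._STOP:
                break
            result = self.handler(event)
            if self.downstream is not None:
                # Blocking put: a slow downstream stage applies back-pressure here.
                self.downstream.events.put(result)

    def shutdown(self):
        """Let queued events drain, then stop and join every worker."""
        for _ in self.threads:
            self.events.put(Stage._STOP)
        for t in self.threads:
            t.join()


def run_demo():
    """Two hypothetical stages: split a query into terms, then count the terms."""
    results = []
    execute = Stage("execute", lambda terms: results.append(len(terms)))
    parse = Stage("parse", str.split, downstream=execute)
    accepted = [parse.submit(q) for q in ["a b", "c d e", "f"]]
    parse.shutdown()    # shut down upstream first so everything reaches `execute`
    execute.shutdown()
    return accepted, sorted(results)


if __name__ == "__main__":
    print(run_demo())
```

The point of the sketch is only that each stage owns its queue and its threads, so load can be observed and shed per stage. A coarser design would share a handful of pools across many kinds of work instead of giving every processing step its own.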
@DavidTurner, got it. 10 years on, I completely understand that you wouldn't refer to the literature on a daily basis (or even regularly) but that's also because those decisions have been baked into Elasticsearch over the past decade. They are the fundamental underpinning of how it works at this point, so there's no need to think about the origin story anymore.
Maybe a better question is whether SEDA influenced Shay's initial design decisions around the architecture of Elasticsearch's distributed systems layer? What does Shay himself say or think on the topic?
Hey, I bumped into Shay and asked him about this, and the answer wasn't clear-cut. Elasticsearch was definitely never intended to be a pure implementation of a SEDA architecture, but the ideas about performance and scalability, particularly in terms of things like avoiding an expensive thread per request, were fairly pervasive in the industry by the time Elasticsearch was born.
So yes, I think it's fair to say that SEDA had some influence (albeit possibly indirectly) on Elasticsearch's overall shape.