Does the Ingest node act as a queue?

I'm designing a new Elasticsearch cluster. Normally I'd have some sort of queue (e.g., Kafka, Redis, SNS/SQS) where all the indexing requests would be queued up, and a process would feed them into ES via the data node. With the new Ingest Node type, I'm curious about the following:

  1. Can the ingest node act as a queue, simplifying the architecture so that something like Kafka, Redis, or SNS/SQS isn't needed?
  2. Does the ingest node persist requests before they are transformed, so that nothing is lost if it crashes or if the data needs to be re-indexed?
  3. If so, how many requests per second can it handle? I'm sure this depends on the system specs and the rules applied, but I'm wondering if there is a general sense.

Thanks,

Scott

It doesn't replace a broker, no.
I'm not sure about the rest, sorry, but I don't believe it persists requests before transformation. Hopefully someone else can clarify.

The ingest node sits on the bulk/index APIs, which are synchronous. Those APIs are not a queueing mechanism: they exert backpressure when Elasticsearch cannot keep up with indexing requests. If you're sending data directly to an ingest node, you need to wait for an acknowledgement before you can treat a request as indexed. This means your indexing application needs to be able to handle backpressure if it talks to Elasticsearch directly.
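To make "handling backpressure" concrete, here is a minimal sketch of a client that waits for an acknowledgement and backs off when a request is rejected. It assumes the rejection shows up as an HTTP 429 status; `send` is a hypothetical callable standing in for whatever HTTP client you use, and the retry limits are arbitrary. A bulk response can also report per-item failures inside an otherwise successful response, which this sketch ignores.

```python
import time


def send_with_backoff(send, payload, max_retries=5, base_delay=0.5, sleep=time.sleep):
    """Call send(payload) until it is acknowledged or the retries run out.

    `send` is a hypothetical callable that performs the indexing request
    and returns the HTTP status code of the response.
    Returns the number of retries that were needed.
    """
    for attempt in range(max_retries + 1):
        status = send(payload)
        if status < 300:
            return attempt  # acknowledged
        if status != 429:
            # Not a backpressure signal; retrying will not help.
            raise RuntimeError(f"indexing failed with HTTP {status}")
        if attempt < max_retries:
            # Rejected because the cluster is busy: wait, doubling the delay each time.
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("still rejected after all retries; the data was not indexed")
```

The important part is the last line: if the retries run out, the caller still holds the data and has to decide what to do with it, because nothing on the Elasticsearch side has queued it.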

It does simplify some architectures, since some applications can handle this backpressure fine. Filebeat is one example. In the simple cases where people were running Filebeat→Logstash→Elasticsearch, or the even more complicated Filebeat→Logstash→external queue→Logstash→Elasticsearch, the pipeline can become simply Filebeat→Elasticsearch.

If your application can't handle backpressure, you can defer the "wait for acknowledgement" function to another component that can, e.g. Logstash, and you can put a persistent queue in between.
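As a rough illustration of what deferring the acknowledgement means, the sketch below has the application hand documents to an intermediate buffer and return immediately, while the buffer removes a document only after the downstream send is acknowledged. The `Forwarder` class and the `send` callable are hypothetical, and the buffer here is in memory, so it is not durable; a real persistent queue would write to disk or to an external broker.

```python
from collections import deque


class Forwarder:
    """In-memory stand-in for a buffering component (hypothetical, not durable)."""

    def __init__(self):
        self.pending = deque()

    def submit(self, doc):
        # The application hands off the document and returns immediately.
        self.pending.append(doc)

    def drain(self, send):
        """Forward queued documents in order; return how many were delivered.

        `send` is a hypothetical callable that returns True once the
        document has been acknowledged downstream.
        """
        delivered = 0
        while self.pending:
            if not send(self.pending[0]):
                break  # not acknowledged: keep it queued and try again later
            self.pending.popleft()  # remove only after acknowledgement
            delivered += 1
        return delivered
```

The application only ever calls `submit`; the component that owns `drain` is the one that absorbs the backpressure.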
