Ingest vs Data

I have read the Node Roles documentation, but am still unclear on the difference between the ingest and data roles. In particular, if a node has the data role but not the ingest role, does that mean that incoming data (e.g., from a Logstash pipeline) will not be written to that node? I'm asking because I have a situation where I want to temporarily stop incoming data to a hot data node (which currently has both ingest and data roles), but I still want to be able to work with the data on that node. How best do I achieve this?

The ingest role means that the node can run Elasticsearch ingest pipeline processors, which change or enrich the data before it is indexed on one or more data nodes.

Ingest nodes can execute pre-processing pipelines, each composed of one or more ingest processors. Depending on the operations the processors perform and the resources they require, it may make sense to have dedicated ingest nodes that perform only this task.
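
To make this concrete, here is a minimal sketch of a pipeline and an indexing request that uses it. The pipeline name add-env-tag, the index name my-index, and the field values are hypothetical, chosen only for illustration.

```console
PUT _ingest/pipeline/add-env-tag
{
  "description": "Hypothetical example: tag each document before it is indexed",
  "processors": [
    { "set": { "field": "environment", "value": "production" } }
  ]
}

POST my-index/_doc?pipeline=add-env-tag
{
  "message": "hello"
}
```

The set processor runs on a node that has the ingest role. The resulting document is then written to whichever data node holds the target shard, and that node does not need the ingest role itself.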

If you remove the ingest role from a node that currently has both the data and ingest roles, you will still be able to work with the data on that node, and to add more data if needed. The only change is that the node will no longer be able to run ingest pipeline processors.

The ingest role relates only to the ability to execute ingest pipeline processors.
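
If it helps, you can check which roles each node currently has with the cat nodes API. This is only a quick sketch; the column selection is my own choice.

```console
GET _cat/nodes?v&h=name,ip,node.role
```

In the node.role column each role is abbreviated to one letter, for example i for ingest and h for data_hot. A hot data node without the ingest role would show h but no i.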

Thank you for the info! If I may ask one follow-up regarding my particular challenge: how do I (or can I) leave a hot data node joined to the cluster, but stop it from receiving new data - while at the same time re-allocating data from that "stopped" node to others? My objective is to get all the data off these nodes. I was hoping I could do this by removing the ingest role, but from your response, that doesn't seem like the solution.

Hi Tim, if you want to decommission one specific node while leaving it in the cluster, you can use this command (with the proper IP address):

PUT _cluster/settings
{
  "persistent" : {
    "cluster.routing.allocation.exclude._ip" : "10.0.0.1"
  }
}

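This is a dynamic setting: you can apply it at any time on a running cluster, and it takes effect quickly. Elasticsearch then relocates the shards on that node to other nodes, where it can do so, and allocates no new shards to it. To revert (re-enable the node), set the value to null.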
More details: "Cluster-level shard allocation and routing settings" in the Elasticsearch Guide [8.1].
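
As a rough sketch of the follow-up steps: the first request is one way to watch the shard count on the excluded node drop as shards move away, and the second clears the exclusion once you want the node to receive shards again.

```console
GET _cat/allocation?v

PUT _cluster/settings
{
  "persistent" : {
    "cluster.routing.allocation.exclude._ip" : null
  }
}
```

Note that shards only leave the node if other nodes are eligible to hold them, so the count may not reach zero if, for example, no other node in the same data tier has room.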
