3 data nodes with node.group=indexA (indexA has "index.routing.allocation.include.group": indexA )
3 data nodes with node.group=indexB (indexB has "index.routing.allocation.include.group": indexB)
4 client nodes running kibana, marvel (no data is stored on these nodes)
I want to isolate the two groups of data nodes from each other. That way I can selectively give users access to the index they are authorized for. Can the cluster remain usable if the data and client nodes in one group cannot reach the data and client nodes in the other group (by blocking the ports), and vice versa?
Allocation of data is not about security. It only controls where the data is placed in the cluster.
A request made on a coordinating node will be able to reach any of the data nodes.
You want to secure your system? Easiest way: use xpack (commercial license).
You can also use a reverse proxy in front of elasticsearch that allows or denies access to specific indices, but beware of APIs like bulk, mget and msearch, which can name indices in the request body rather than in the URL.
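To illustrate why bulk, mget and msearch need care, here is a minimal sketch of the check a reverse proxy might run before forwarding a request. It is not a complete or production-ready filter: the function names and the allow-list are hypothetical, it assumes the request body has already been read as text, and it only covers the common body shapes of these three APIs. Malformed bodies raise an exception, which a proxy should treat as a denial.

```python
import json

MULTI_APIS = ("_bulk", "_mget", "_msearch")


def _split_names(value):
    """Turn "a,b" or ["a", "b"] into a set of index names."""
    if isinstance(value, str):
        value = value.split(",")
    return {name for name in value if name}


def indices_in_path(path):
    """Index names in the URL, e.g. /indexA,indexB/_search."""
    first = path.split("?", 1)[0].strip("/").split("/", 1)[0]
    if not first or first.startswith("_"):
        return set()  # e.g. /_bulk or /_search: no index named in the URL
    return _split_names(first)


def indices_in_body(path, body):
    """Index names that a bulk, mget or msearch body refers to."""
    segments = path.split("?", 1)[0].strip("/").split("/")
    api = next((s for s in segments if s in MULTI_APIS), None)
    lines = [ln for ln in body.splitlines() if ln.strip()]
    found = set()
    if api == "_mget" and body.strip():
        # One JSON object with a "docs" array; each doc may carry "_index".
        for doc in json.loads(body).get("docs", []):
            found |= _split_names(doc.get("_index", ""))
    elif api == "_bulk":
        # NDJSON: an action line, then a source line (none after "delete").
        i = 0
        while i < len(lines):
            op, meta = next(iter(json.loads(lines[i]).items()))
            found |= _split_names(meta.get("_index", ""))
            i += 1 if op == "delete" else 2
    elif api == "_msearch":
        # NDJSON: header and search body alternate; headers may carry "index".
        for header in lines[0::2]:
            found |= _split_names(json.loads(header).get("index", []))
    return found


def is_allowed(path, body, allowed):
    """Allow only requests that name at least one index, all on the allow-list."""
    requested = indices_in_path(path) | indices_in_body(path, body)
    return bool(requested) and requested <= set(allowed)


if __name__ == "__main__":
    bulk = '{"index": {"_index": "indexB", "_id": "1"}}\n{"field": 1}\n'
    print(is_allowed("/indexA/_search", "", {"indexA"}))
    print(is_allowed("/indexA/_bulk", bulk, {"indexA"}))
```

The second call shows the trap: the URL names only indexA, yet the bulk body writes to indexB, so a proxy that filters on the URL alone would let it through. Requests that name no index at all, or that use `_all` or wildcards, are denied here because they are not on the allow-list. Elasticsearch also has a `rest.action.multi.allow_explicit_index` setting that, when set to `false`, rejects requests that specify an index in the body.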
Thanks @dadoonet
I was planning on controlling data allocation and then having firewall rules block port 9300 between the data and client nodes of the two groups. I was curious how this affects the state of the cluster. The master node will have access to all nodes, as you mentioned.
Also, with xpack/shield, can I have custom logic to map the user, IP, request, etc. to a user or a role in my system to enable authorization (using role-based access control)?
A cluster in elasticsearch operates as a whole. Without xpack, any node in the cluster is designed to give you information about any index that is stored in the cluster. In other words, kibana can connect to any node in the cluster and get information from any index in the cluster. It doesn't need access to the node where the data actually resides; other nodes will happily reach out to those nodes and return the data.
Setting up non-trivial access controls from outside of elasticsearch is a difficult problem. I tried to explain some of the issues that you need to deal with in this Elastic{ON} presentation, but was only able to scratch the surface. If you really want to separate access from outside of elasticsearch using IP rules, you should roll out two completely isolated clusters.