Why do data nodes have sharding metadata?


As I understand it, data nodes and master nodes have separate responsibilities. Why, then, do data nodes store shard metadata? Isn't this redundant?
Wouldn't it be enough to keep the metadata on the master nodes only, with every data node querying the master for this information?

Any node in the cluster can act as a coordinating node and therefore needs to know the location of all shards. Data nodes also need access to the mappings. As this data is accessed frequently, keeping it in local memory is vital for performance.
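
As a rough illustration of why a local copy helps, here is a minimal sketch of how a coordinating node could resolve a document to a node without leaving its own memory. The node names, the `LOCAL_ROUTING_TABLE` dictionary and both function names are made up for this example, and `zlib.crc32` is only a stand-in for the hash function Elasticsearch actually uses (Murmur3).

```python
import zlib

# Hypothetical local copy of the routing table: shard number -> node name.
# In a real cluster this information is part of the cluster state held by every node.
LOCAL_ROUTING_TABLE = {0: "node-a", 1: "node-b", 2: "node-c"}


def shard_for(routing_key: str, num_primary_shards: int) -> int:
    """Map a routing key to a shard number.

    crc32 is a stand-in hash for illustration only; it is not what
    Elasticsearch uses internally.
    """
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


def node_for(routing_key: str) -> str:
    """Resolve the target node with a purely local, in-memory lookup."""
    shard = shard_for(routing_key, len(LOCAL_ROUTING_TABLE))
    return LOCAL_ROUTING_TABLE[shard]


if __name__ == "__main__":
    for doc_id in ("doc-1", "doc-2", "doc-3"):
        print(doc_id, "->", node_for(doc_id))
```

Every call to `node_for` is a hash plus a dictionary lookup, so no request to the master is involved when a search or index request is routed.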

Hi, but isn't this highly redundant? This information could surely be obtained on demand from the master. I do understand that minimizing latency is the priority.


Please be patient and do not bump that quickly. This forum is manned by volunteers that generally also have a day job.

This information is stored on all nodes for performance reasons. As it is accessed frequently, fetching it from the master on every request would add a lot of unnecessary network traffic and latency.
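
To put a number on the extra traffic, here is a toy comparison, not a model of the real implementation. The `Master` class and both routing functions are hypothetical; the sketch only counts how many times the master would be contacted under each design.

```python
class Master:
    """Hypothetical master that counts how many lookups it serves."""

    def __init__(self, routing_table):
        self.routing_table = dict(routing_table)
        self.lookups = 0

    def locate(self, shard):
        # Each call stands for one network round trip to the master.
        self.lookups += 1
        return self.routing_table[shard]


def route_on_demand(master, shards):
    """Ask the master for the location of every shard, every time."""
    return [master.locate(shard) for shard in shards]


def route_with_local_copy(master, shards):
    """Copy the table once (assumed to be pushed by the master), then look up locally."""
    local = dict(master.routing_table)
    return [local[shard] for shard in shards]


if __name__ == "__main__":
    requests = [0, 1, 2] * 1000  # 3000 shard lookups

    on_demand = Master({0: "node-a", 1: "node-b", 2: "node-c"})
    route_on_demand(on_demand, requests)
    print("on demand, master lookups:", on_demand.lookups)

    cached = Master({0: "node-a", 1: "node-b", 2: "node-c"})
    route_with_local_copy(cached, requests)
    print("local copy, master lookups:", cached.lookups)
```

In the on-demand variant the master is contacted once per lookup, so its load grows with request volume; with a local copy it is only involved when the table itself changes.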
