I have a cluster with 4 nodes: 2 master nodes and 2 data nodes. Currently only one master is responding. All the other nodes connect for 10 minutes and shard allocation starts, then the cluster becomes NA again. I'm not sure what the issue is.

Are both nodes, Es-Master-1 and Es-Master-2, running? You need both of them to be running, and you will also need to add a new master node.

There is not much to do: you need to bring the node that is not running back online, and after that add a new master node for resilience.
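The reason both nodes matter can be sketched with simple majority arithmetic. This is only a minimal illustration, assuming that electing a master needs a strict majority of the master-eligible nodes; the function names are made up for this example, and Elasticsearch's real voting logic has more detail than this.

```python
def quorum(master_eligible: int) -> int:
    """Smallest strict majority of the master-eligible nodes."""
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible: int) -> int:
    """How many master-eligible nodes can be down while a majority remains."""
    return master_eligible - quorum(master_eligible)


for n in (2, 3):
    print(f"{n} master-eligible: quorum={quorum(n)}, can lose {tolerated_failures(n)}")
```

Under that assumption, two master-eligible nodes cannot lose either one, while three can lose one and still elect a master.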

Also, with the following configuration the Es-Master-1 node is not a dedicated master; it is also a data node. Is it the same for Es-Master-2? And what does the elasticsearch.yml for the Es-Aggr nodes look like?

From the log you shared it seems that the nodes Es-Master-1 and Es-Master-2 are both master and data nodes, but the nodes Es-Aggr-1 and Es-Aggr-2 are only data nodes.

This setting should list only the master-eligible nodes; remove the Es-Aggr nodes if they are data-only.
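One way to work out which nodes belong in that setting is to look at each node's roles. The sketch below assumes rows shaped like the JSON output of `GET _cat/nodes?format=json&h=name,node.role`, where the letter `m` in `node.role` marks a master-eligible node. The sample rows are hypothetical and only reuse the node names from this thread, and `master_eligible_names` is a helper made up for this example.

```python
import json

# Hypothetical sample, shaped like `GET _cat/nodes?format=json&h=name,node.role`.
# The role strings are assumptions, not real output from the cluster discussed here.
SAMPLE = """
[
  {"name": "Es-Master-1", "node.role": "dim"},
  {"name": "Es-Master-2", "node.role": "dim"},
  {"name": "Es-Aggr-1", "node.role": "di"},
  {"name": "Es-Aggr-2", "node.role": "di"}
]
"""


def master_eligible_names(rows):
    """Return the sorted names of nodes whose role string contains 'm'."""
    return sorted(row["name"] for row in rows if "m" in row["node.role"])


rows = json.loads(SAMPLE)
print(master_eligible_names(rows))
```

With these sample rows only the two Es-Master nodes are returned, so those are the only names that would go in the setting.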

Any reason for changing those settings? The default value is 2; the value you have set is way too high and can heavily impact recoveries.