Zone-A and Zone-B are two different datacenters.
The node attributes (output of GET _cat/nodeattrs?v&s=node) look like this:
Yes, we are using allocation awareness.
The elasticsearch.yml configuration has the following parameters, whose values depend on the zone and the node type (hot/warm):
node.attr.mode: data_node
node.attr.zone: "zone-A"
node.attr.temp: "hot"
The current cluster settings are like this:
{
  "persistent" : {
    "cluster" : {
      "routing" : {
        "allocation" : {
          "awareness" : {
            "attributes" : "zone",
            "force" : {
              "zone" : {
                "values" : [
                  "zone-A",
                  "zone-B"
                ]
              }
            }
          },
          "disk" : {
            "watermark" : {
              "low" : "1200gb",
              "flood_stage" : "150gb",
              "high" : "1200gb"
            }
          }
        }
      },
      "info" : {
        "update" : {
          "interval" : "60s"
        }
      }
    },
    "indices" : {
      "recovery" : {
        "max_bytes_per_sec" : "400mb"
      }
    }
  },
  "transient" : {
    "cluster" : {
      "routing" : {
        "allocation" : {
          "node_concurrent_incoming_recoveries" : "1",
          "cluster_concurrent_rebalance" : "2",
          "node_concurrent_recoveries" : "1"
        }
      }
    },
    "indices" : {
      "recovery" : {
        "max_bytes_per_sec" : "40mb"
      }
    }
  }
}
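One thing worth noting in the output above: indices.recovery.max_bytes_per_sec is set in both the persistent and the transient block. Transient settings take precedence over persistent ones, so the effective value should be 40mb, not 400mb. Here is a minimal Python sketch of that precedence. The helper names flatten and effective_settings are made up for illustration, and the dict is an abbreviated copy of the settings above.

```python
def flatten(settings, prefix=""):
    """Turn nested settings into dotted keys, e.g. indices.recovery.max_bytes_per_sec."""
    flat = {}
    for key, value in settings.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, dotted))
        else:
            flat[dotted] = value
    return flat


def effective_settings(cluster_settings):
    """Merge persistent and transient settings; transient wins on conflicts."""
    merged = flatten(cluster_settings.get("persistent", {}))
    merged.update(flatten(cluster_settings.get("transient", {})))
    return merged


# Abbreviated copy of the GET _cluster/settings output shown above.
cluster_settings = {
    "persistent": {"indices": {"recovery": {"max_bytes_per_sec": "400mb"}}},
    "transient": {
        "cluster": {"routing": {"allocation": {"cluster_concurrent_rebalance": "2"}}},
        "indices": {"recovery": {"max_bytes_per_sec": "40mb"}},
    },
}

effective = effective_settings(cluster_settings)
print(effective["indices.recovery.max_bytes_per_sec"])
```

So recoveries and relocations are probably throttled to 40mb per second per node at the moment, which may matter for how quickly any rebalancing can complete.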
Also, the daily indices have the following settings, which ensure that new indices are created only on hot nodes. After 90 days we move them to warm nodes.
{
  "time-data-2012.12.13" : {
    "settings" : {
      "index" : {
        "routing" : {
          "allocation" : {
            "require" : {
              "temp" : "hot"
            },
            "total_shards_per_node" : "3"
          }
        },
        "mapping" : {
          "nested_fields" : {
            "limit" : "1000"
          },
          "total_fields" : {
            "limit" : "10000"
          }
        },
        "refresh_interval" : "55s",
        "number_of_shards" : "24",
        "translog" : {
          "sync_interval" : "25s",
          "durability" : "async"
        },
        "provided_name" : "time-data-2012.12.13",
        "merge" : {
          "scheduler" : {
            "max_thread_count" : "3"
          }
        },
        "unassigned" : {
          "node_left" : {
            "delayed_timeout" : "30m"
          }
        },
        "number_of_replicas" : "2"
      }
    }
  }
}
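To put numbers on what these index settings imply, here is a rough Python sketch of the arithmetic. The values come from the settings above plus our two zones. The per-zone cap is my understanding of how forced awareness spreads the copies of a shard and should be treated as an assumption.

```python
import math

# Values taken from the index settings above.
number_of_shards = 24
number_of_replicas = 2
total_shards_per_node = 3
zones = 2  # zone-A and zone-B

# Each shard has one primary plus its replicas.
copies_per_shard = 1 + number_of_replicas
total_copies = number_of_shards * copies_per_shard

# Fewest hot nodes that can hold one daily index under total_shards_per_node.
min_hot_nodes = math.ceil(total_copies / total_shards_per_node)

# Assumption: awareness allows at most ceil(copies / zones) copies of a shard per zone.
max_copies_per_shard_per_zone = math.ceil(copies_per_shard / zones)

print(total_copies, min_hot_nodes, max_copies_per_shard_per_zone)
```

With three copies per shard and only two zones, one zone always ends up holding two copies of each shard, so the copies cannot be split evenly across the zones.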
To give you an idea of how shards are allocated across the nodes, here is the output of the GET _cat/allocation?v&s=node API:
Currently we manually reassign hot shards from zone-B to zone-A every day, because zone-B receives most of the primaries, which causes load and subsequent indexing problems there.
How can we make the cluster rebalance itself?
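To quantify the primary imbalance described above, the primaries per zone can be counted from the output of GET _cat/shards?h=index,shard,prirep,node. Below is a minimal Python sketch. The function name primaries_per_zone, the node names, and the sample rows are all made up for illustration and are not our real output.

```python
from collections import Counter


def primaries_per_zone(cat_shards_text, node_zone):
    """Count primary shards per zone from `_cat/shards?h=index,shard,prirep,node` text."""
    counts = Counter()
    for line in cat_shards_text.strip().splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # unassigned shards have no node column
        _index, _shard, prirep, node = fields[:4]
        if prirep == "p":
            counts[node_zone.get(node, "unknown")] += 1
    return dict(counts)


# Hypothetical node-to-zone mapping (would come from _cat/nodeattrs).
node_zone = {
    "hot-a-1": "zone-A",
    "hot-a-2": "zone-A",
    "hot-b-1": "zone-B",
    "hot-b-2": "zone-B",
}

# Made-up sample rows in the index/shard/prirep/node column order.
sample = """
time-data-2012.12.13 0 p hot-b-1
time-data-2012.12.13 0 r hot-a-1
time-data-2012.12.13 0 r hot-b-2
time-data-2012.12.13 1 p hot-b-2
time-data-2012.12.13 1 r hot-a-2
time-data-2012.12.13 1 r hot-b-1
time-data-2012.12.13 2 p hot-a-1
time-data-2012.12.13 2 r hot-b-1
time-data-2012.12.13 2 r hot-a-2
"""

result = primaries_per_zone(sample, node_zone)
print(result)
```

Running something like this once a day against the real _cat/shards output would show how skewed the primaries are before and after our manual moves.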