What you want to do is certainly possible, but it won't be easy or cheap. You do have a couple of advantages with a static data set and relatively generous query-response requirements, but you're still talking about a large cluster no matter how you slice it. Even if you reduce the footprint to 400TB, you still have to account for replicas, which could push you back into the PB range depending on your requirements.
There are so many considerations here that really depend on exactly what the data looks like and what capabilities the cluster needs to provide. For instance, if you need to run aggregations across that much data, you'll have very different considerations (and expense) than if you just need basic searches.
So start with the data: think about what it looks like on a per-document basis, how it needs to be indexed, and how it needs to be searched. From there you can start to project sizing numbers and do index design. In particular, consider what logical partitions exist in the data so that you can segment it into smaller indices. If you have the luxury of time-series data, definitely leverage that; otherwise, consider how else it can be partitioned. A single 400TB index will be a nightmare to manage, and you'll be a very unhappy camper if you were to, say, lose a shard out of it. Smaller indices give you much more flexibility.
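To make the partitioning idea concrete, here's a minimal sketch of two routing strategies: monthly indices for time-series data, and a stable hash bucket for data with no natural time axis. The `timestamp` field, the `events`/`data` prefixes, and the bucket count are all invented for illustration, not anything your data dictates.

```python
import zlib
from datetime import datetime

def monthly_index(doc: dict, prefix: str = "events") -> str:
    """Route a doc to a monthly index, e.g. 'events-2021.07'.

    Assumes each doc carries an ISO-8601 'timestamp' field; the
    field name and the 'events' prefix are assumptions here.
    """
    ts = datetime.fromisoformat(doc["timestamp"])
    return f"{prefix}-{ts.year}.{ts.month:02d}"

def bucket_index(doc_id: str, buckets: int = 32, prefix: str = "data") -> str:
    """For non-time-series data: partition on a stable hash of some
    key (here the doc id) into a fixed number of smaller indices,
    so each index stays a manageable fraction of the total."""
    return f"{prefix}-{zlib.crc32(doc_id.encode()) % buckets:03d}"
```

Either way, the point is the same: losing (or reindexing) one partition touches a small slice of the data instead of the whole 400TB.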
Whether you can do this with a single cluster will also depend on how the sizing numbers come out. You can't run 100TB on a single node, so you have to find the right node size for your use case and then project the number of nodes required. If it starts to look like you're going to need hundreds of nodes (and if you use SSDs, you probably will), then you may need to consider multiple clusters. Also be sure that any sizing calculation factors in overhead for cluster operations: if you need 1PB of space just for data, you still need per-node headroom so things stay functional when there are issues in the cluster. The details depend on exactly how many nodes you have and the ultimate size of your shards. This is hard to get right.
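As a back-of-the-envelope version of that node-count math: every number below (per-node usable disk, replica count, on-disk expansion of indexed data, headroom fraction) is an assumption you'd replace with figures measured on your own data and hardware.

```python
import math

def estimate_nodes(raw_tb: float,
                   replicas: int = 1,
                   index_overhead: float = 1.1,
                   disk_tb_per_node: float = 8.0,
                   headroom: float = 0.70) -> int:
    """Rough node count: total on-disk footprint divided by how much
    of each node's disk you're willing to actually fill.

    replicas=1 doubles the stored data; index_overhead covers the
    expansion of indexed data vs. raw source; headroom=0.70 leaves
    ~30% of each node free so shard recovery and relocation still
    work when nodes fail. All defaults are illustrative guesses.
    """
    total_tb = raw_tb * (1 + replicas) * index_overhead
    return math.ceil(total_tb / (disk_tb_per_node * headroom))

# 400TB raw with one replica under these assumptions:
print(estimate_nodes(400))
```

Notice how sensitive the answer is to `disk_tb_per_node` and `headroom`; that's exactly the part that's hard to get right without testing.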
Once you have an estimate of what you think a node will look like, test a single-node configuration: load it with a representative amount of data and figure out whether it can handle your query-response requirements. ES queries scale surprisingly well as you add nodes, but if your nodes are the wrong size everything can fall apart. As you do this you'll also encounter some of the limits ES has when dealing with truly large amounts of data. Really explore how much data you can actually put on a node; this is heavily affected by the number of shards on the node. ES works best at lower density, but with your query requirements you may be able to run higher density if you can size things so a node doesn't run out of heap just starting up. That means fewer, larger shards. The fact that your dataset is static could really help you here.
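One limit worth screening for early is heap consumed just by holding shards open. A rough sanity check, using the commonly cited rule of thumb of at most ~20 shards per GB of heap (an assumption, not a guarantee; dense static indices may tolerate more, so measure on your own nodes):

```python
import math

def max_shards_for_heap(heap_gb: float, shards_per_gb: float = 20.0) -> int:
    """Rule-of-thumb ceiling on open shards per node. 20 shards/GB
    of heap is community guidance, not a hard limit."""
    return int(heap_gb * shards_per_gb)

def shards_implied(data_tb_on_node: float, shard_size_gb: float = 50.0) -> int:
    """Shard count implied by a target shard size (50GB assumed)."""
    return math.ceil(data_tb_on_node * 1024 / shard_size_gb)

# Hypothetical check: a 31GB heap node holding 10TB in 50GB shards.
budget = max_shards_for_heap(31)   # shard budget for the heap
needed = shards_implied(10)        # shards that 10TB would create
print(needed, budget, needed <= budget)
```

If `needed` comes out anywhere near `budget`, that's the signal to move to fewer, larger shards before you ever scale out.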
Anyway, it's impossible for me to give you any real answers from here, so hopefully this rambling is of some help.