This might be useful to you:
It's obviously not done, but it could help fix these locality issues.
I don't have a good idea of how much memory each index takes, but it certainly grows linearly with the number of indexes.
You'll probably be the only person doing it, so you are likely to hit new and fun bugs no one has thought of. There are folks who have lots of always-open indexes - I have two thousand or so - and I can feel cluster state actions being a bit sluggish. I'm still on an old version of Elasticsearch at the moment, and I know some of that has been fixed, but I still expect the scale of what you are describing to be "fun".
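If you want to see that sluggishness for yourself, a couple of stock APIs are enough. Just a sketch, assuming a node on localhost:9200:

```
# How many indexes the cluster is tracking, and whether cluster state updates are queueing up.
curl -s 'http://localhost:9200/_cat/indices?h=index' | wc -l
curl -s 'http://localhost:9200/_cluster/pending_tasks?pretty'
```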
I think it's worth writing a test application that pretends to do what you are going to do - it could be as simple as a bash script that creates your million indexes, jams 20 docs in each, and closes them. Measuring memory consumption in Java is hard, but you should be able to simulate your task that way and test it. I'd advise against doing what you plan to do without first proving it out like that.
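A rough sketch of what that script could look like, assuming a scratch node on http://localhost:9200 and a version that accepts the _doc endpoint; the index name pattern, doc body, and counts are all placeholders:

```
#!/usr/bin/env bash
# Rough simulation: create N small indexes, add a few docs to each, then close them.
# ES, N, and DOCS are placeholders - point ES at a throwaway cluster, not production.
ES=http://localhost:9200
N=1000000
DOCS=20

for ((i = 1; i <= N; i++)); do
  idx="scratch-$i"
  # One shard, no replicas keeps per-index overhead as small as possible.
  curl -s -XPUT "$ES/$idx" -H 'Content-Type: application/json' \
       -d '{"settings":{"number_of_shards":1,"number_of_replicas":0}}' > /dev/null
  for ((d = 1; d <= DOCS; d++)); do
    curl -s -XPUT "$ES/$idx/_doc/$d" -H 'Content-Type: application/json' \
         -d "{\"field\":\"doc $d\"}" > /dev/null
  done
  curl -s -XPOST "$ES/$idx/_close" > /dev/null
done
```

While it runs you can sample heap usage with GET _nodes/stats/jvm every so often and watch how it trends as the closed-index count grows.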
Or go comment on the issue I linked above and maybe pitch in there somehow.