Effect of a large number of fields (stored=false) used to vary text analysis per language

I have a corpus of multilingual documents (around 25 languages), each with 3-4 multilingual fields (title, text, description, keywords). We are considering two approaches:

  1. a language-per-field approach (sketched in the code after this list) and
  2. a language-per-index approach
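
To make the first option concrete, here is roughly what I have in mind on the Lucene side. This is only a minimal sketch: field names such as title_en and title_fr are placeholders, and it assumes a recent Lucene release where the analyzers and IndexWriterConfig no longer take a Version argument.

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.en.EnglishAnalyzer;
import org.apache.lucene.analysis.fr.FrenchAnalyzer;
import org.apache.lucene.analysis.miscellaneous.PerFieldAnalyzerWrapper;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriterConfig;

public class PerFieldLanguageSetup {

    public static IndexWriterConfig buildConfig() {
        // One entry per language-specific field name; a given document only
        // populates the fields for the languages it actually contains.
        Map<String, Analyzer> fieldAnalyzers = new HashMap<>();
        fieldAnalyzers.put("title_en", new EnglishAnalyzer());
        fieldAnalyzers.put("text_en", new EnglishAnalyzer());
        fieldAnalyzers.put("title_fr", new FrenchAnalyzer());
        fieldAnalyzers.put("text_fr", new FrenchAnalyzer());
        // ... roughly 3-4 fields x 25 languages = ~100 field names in total

        // Fields without an explicit mapping fall back to the default analyzer.
        Analyzer perField =
                new PerFieldAnalyzerWrapper(new StandardAnalyzer(), fieldAnalyzers);
        return new IndexWriterConfig(perField);
    }
}
```

Since each document would populate only the fields for its own language(s), most of the ~100 field names would be empty for any given document, which is what prompts the questions below.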

The querying scenario does not always involve a single language either, as the querying user is often interested in documents in a few languages.

I would like to know the effect of having around 100 fields (3-4 fields x ~25 languages) defined per document:

  1. Does it cause any memory overhead (field caches, etc.)?
  2. Does the mere presence of a large number of null (empty) fields have an impact on the index itself?
  3. Are the analyzed, tokenized terms stored at a per-field level?

If I were to go with the multi-index model (language per index), what would be the impact of having 200 million+ documents distributed over 25 indexes, where a few indexes hold the bulk (around 70%) of the documents?

Which would be the preferred model? The index-per-language model avoids having to craft queries spanning different fields for different languages, and seems easier to manage.
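
For example, with the per-field model a search over English and French content would have to enumerate every relevant language-specific field, roughly along these lines (again just a sketch with placeholder field names; in practice the same PerFieldAnalyzerWrapper as above would be passed so each field is analyzed with its own language analyzer):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryparser.classic.MultiFieldQueryParser;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.search.Query;

public class CrossLanguageQuery {

    public static Query build(String userInput) throws ParseException {
        // Every language of interest adds its own set of fields to the query.
        String[] fields = {"title_en", "text_en", "title_fr", "text_fr"};
        MultiFieldQueryParser parser =
                new MultiFieldQueryParser(fields, new StandardAnalyzer());
        return parser.parse(userInput);
    }
}
```

With the index-per-language model, the same search would instead be issued against the English and French indexes with a single set of field names.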

Can anyone throw some light on this?

Thanks
Swami