Geographic data vs data location

Think about THE big one: Google.

(First, I'm leaving China out of this example, because much Chinese data
is ILLEGAL to provide for search outside of China.)

If data is generated by people in Europe, in various languages:
1/ Is it stored close to where it is generated?
2/ Are sharding and replication also done close to where it is generated?
3/ How accessible IS that data to someone from the US who speaks one of those languages?
4/ How much sharding and replication is done AWAY from where the data is geographically generated?

What are people's thoughts on making sites that cater to people
interested in web pages, etc., from other countries? Any examples out
there?