Hi,
I think what you are describing is usually called the "Near Duplicate Detection" problem in the literature. I was involved in a project that had good results with shingling approaches similar to the one described in A. Broder, "Identifying and Filtering Near-Duplicate Documents".
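To give a rough idea of what shingling means here, a minimal sketch in Python (not the code from that project, and not Broder's full scheme, which also samples the shingles to keep the sets small). The function names and the window size w=4 are my own choices for illustration:

```python
def shingles(text, w=4):
    """Return the set of contiguous w-word sequences (shingles) in text."""
    tokens = text.lower().split()
    if len(tokens) < w:
        # Short text: treat the whole token sequence as one shingle.
        return {tuple(tokens)} if tokens else set()
    return {tuple(tokens[i:i + w]) for i in range(len(tokens) - w + 1)}


def resemblance(a, b, w=4):
    """Jaccard overlap of the shingle sets of two texts, in [0, 1]."""
    sa, sb = shingles(a, w), shingles(b, w)
    if not sa and not sb:
        return 1.0  # two empty documents are considered identical
    return len(sa & sb) / len(sa | sb)
```

Two documents whose resemblance is above some threshold you pick (say 0.9) would be flagged as near duplicates. Whitespace tokenization is an assumption too; in practice you would reuse whatever analyzer your index uses.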
Having said that, what do you find missing in "More Like This"? I'd imagine that getting the top N MLT documents and then computing a simple set similarity (e.g. on the term vectors of specific fields) would get you a good part of the way.
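As a sketch of that second step, assuming you have already fetched the top N MLT hits and pulled the terms of the relevant field for each one into a plain dict (the dict layout, the function names and the 0.8 threshold are hypothetical, nothing here calls the MLT API itself):

```python
def jaccard(a, b):
    """Jaccard similarity of two term collections, in [0, 1]."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty term sets are considered identical
    return len(a & b) / len(a | b)


def near_duplicates(query_terms, candidates, threshold=0.8):
    """Keep candidates whose term set is close enough to the query's.

    candidates maps a document id to the terms of one field of that
    document. Returns (doc_id, score) pairs, best match first.
    """
    scored = [(doc_id, jaccard(query_terms, terms))
              for doc_id, terms in candidates.items()]
    hits = [(doc_id, score) for doc_id, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

Since MLT has already narrowed things down to N candidates, the pairwise comparison stays cheap even with a naive set intersection like this.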