Noise Word handling on ingest

A static set of noise words can be configured, but the list can get long and won't adapt as the content changes over time.
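
For illustration, here is a rough sketch of what a static list could look like, assuming the system in question is Elasticsearch and the list is applied with a `stop` token filter in a custom analyzer. The word list and the names `noise_words` and `noise_free` are made up for this example. The code only builds and prints the settings body you would send when creating the index.

```python
import json

# Hypothetical noise-word list; a real one would come from your own content.
NOISE_WORDS = ["regards", "thanks", "unsubscribe", "Thanks"]


def build_index_settings(noise_words):
    """Build index settings with a custom analyzer that drops the given words.

    Assumes Elasticsearch's `stop` token filter; the filter and analyzer
    names ("noise_words", "noise_free") are placeholders for this sketch.
    """
    # Lowercase and de-duplicate, because the analyzer lowercases tokens
    # before the stop filter sees them.
    stopwords = sorted({w.lower() for w in noise_words})
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "noise_words": {"type": "stop", "stopwords": stopwords}
                },
                "analyzer": {
                    "noise_free": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "noise_words"],
                    }
                },
            }
        }
    }


index_settings = build_index_settings(NOISE_WORDS)
print(json.dumps(index_settings, indent=2))
```

Every new noise word means editing this list and reindexing, which is the maintenance cost mentioned above.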

We do have an aggregation designed to spot anomalies when compared to a background set of documents (e.g. today's docs vs last week's). This thread might be of interest, but note the major caveat: this diffing analysis is not possible if you use time-based indices and the content for "today" is on a machine remote from the rest-of-time content you want to compare against.
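
As a sketch of the "today vs last week" comparison, assuming the aggregation meant here is Elasticsearch's `significant_terms` with a `background_filter`: the field names `body` and `timestamp` and the aggregation name `unusual_today` are placeholders, and the code only builds the search body rather than calling a cluster.

```python
import json


def build_diff_request(term_field, date_field):
    """Build a search body comparing today's terms against the last 7 days.

    Assumes the `significant_terms` aggregation with `background_filter`;
    `term_field` and `date_field` are hypothetical field names.
    """
    return {
        "size": 0,  # only the aggregation result is wanted, not the hits
        # Foreground set: documents from today.
        "query": {"range": {date_field: {"gte": "now/d"}}},
        "aggs": {
            "unusual_today": {
                "significant_terms": {
                    "field": term_field,
                    # Background set: the last 7 days including today, so
                    # the background is a superset of the foreground.
                    "background_filter": {
                        "range": {date_field: {"gte": "now-7d/d"}}
                    },
                }
            }
        },
    }


diff_request = build_diff_request("body", "timestamp")
print(json.dumps(diff_request, indent=2))
```

Because of the caveat above, a request like this only makes sense against an index that holds both today's documents and the background week together.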