Clients sending invalid docs

I have a bunch of clients sending documents directly to my ES cluster via bulkIndex. Those documents are valid under 1.x but invalid under 2.x: many of them have periods in field names (e.g. user.name), which 2.x rejects. Unfortunately, upgrading all of those clients won't be trivial or quick.

My current plan is to stand up a separate 2.3 cluster. I can use Logstash's elasticsearch input/output plugins to copy over all the existing indices. My problem is that new data is being written all the time with the now-invalid documents. Is there anything in Logstash that will watch an index and pull in new data as it arrives? All the indices are daily indices; at worst I could set up a cron job to copy over the current day's index on some interval, but I'd rather have it react to real traffic.
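For reference, the copy job I have in mind looks roughly like this. It's a minimal sketch: the host names and index pattern are placeholders, and it assumes the de_dot filter plugin is installed (bin/plugin install logstash-filter-de_dot) to rewrite the dotted field names that 2.x rejects:

```
input {
  elasticsearch {
    hosts   => ["old-cluster:9200"]   # 1.x cluster (placeholder host)
    index   => "logstash-*"           # assuming logstash-style daily indices
    docinfo => true                   # keep _index/_type/_id in @metadata
  }
}

filter {
  de_dot { }   # renames dotted top-level fields, e.g. "user.name" -> "user_name"
}

output {
  elasticsearch {
    hosts         => ["new-cluster:9200"]       # 2.3 cluster (placeholder host)
    index         => "%{[@metadata][_index]}"   # write to the same daily index
    document_type => "%{[@metadata][_type]}"
    document_id   => "%{[@metadata][_id]}"      # preserve IDs so re-runs overwrite
  }
}
```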

The ES input can't stream like that; it's batch only.

OK, is there any way ES Watcher could help here? Short of upgrading every client (which, again, is out of my control), the only two other options I can see are:

  1. Write some sort of HTTP proxy that intercepts each request and rewrites it before it hits ES.
  2. Wake the ES input every N minutes and re-index everything (see the sketch after this list). If I limit it to the past 24-48 hours, I think that's feasible; I'm not sending that much data, maybe a gigabyte a day at most.
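For option 2, a rough sketch of the input side, meant to be run from cron every N minutes; the host, index pattern, and 48-hour window are assumptions, and the filter/output sections would be the same as in the copy job above:

```
input {
  elasticsearch {
    hosts   => ["old-cluster:9200"]   # 1.x cluster (placeholder host)
    index   => "logstash-*"           # matches today's and yesterday's daily indices
    docinfo => true                   # keep _index/_type/_id in @metadata
    # only pull documents written in the last 48 hours
    query   => '{ "query": { "range": { "@timestamp": { "gte": "now-48h" } } } }'
  }
}
```

Because the output reuses each document's original _id, re-running over the same window just overwrites documents that were already copied rather than duplicating them.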

Thoughts on either option, or on what I could have done to stay out of this hole in the first place? I suppose I could have routed everything through Logstash from the start, but since no transformation was needed at the time, it seemed wasteful. And to be honest, Elasticsearch has performed a lot more reliably in my experience than Logstash.

It could, though it might get a bit complicated.

I'd do that.