I have an index of 30M items. I indexed them by reading a CSV and posting it with cURL from the command line, following an article I found on the Elasticsearch blog, so I did not use Logstash. Indexing took a few days, and I do not expect the dataset to change again.
I have since noticed that one field contains numbers padded with leading zeros, e.g. "0001", "0012", "0123", "1234". The field is mapped as a keyword.
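For illustration, this is roughly how the mismatch shows up (the index and field names here are made up, not my real mapping). A term query for the unpadded value finds nothing, because keyword matching is exact:

```json
POST my-index/_search
{
  "query": {
    "term": { "item_number": "123" }
  }
}
```

This returns no hits even though a document with "item_number": "0123" exists.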
This causes problems when I use simple_query_string to match user-typed queries, where the typed values may appear across a multitude of fields: a user who types "123" will never match the stored keyword "0123".
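My query looks something like this (field names are placeholders for my actual fields):

```json
POST my-index/_search
{
  "query": {
    "simple_query_string": {
      "query": "123 widget",
      "fields": ["item_number", "name", "description"]
    }
  }
}
```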
What technique would allow me either to transform the data in place, or to have searches treat the field through some sort of filter that strips the leading zeros?
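Concretely, the transformation I'm after is just leading-zero stripping, sketched here in plain Python (not tied to Elasticsearch, just to show the intent):

```python
def strip_leading_zeros(value: str) -> str:
    """Drop leading '0' characters, keeping a single '0' if the input is all zeros."""
    stripped = value.lstrip("0")
    return stripped or "0"

print(strip_leading_zeros("0001"))  # -> "1"
print(strip_leading_zeros("0123"))  # -> "123"
print(strip_leading_zeros("1234"))  # -> "1234"
```

Ideally this would happen either once over the whole index, or transparently at query/index analysis time.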