I am not even sure this is the right thing to do. The fields I am trying to remove have the same value across all the records, so I figured it might be better to remove them altogether and keep only the ones that vary.
You don't need to do this - the datafeed does it for you.
The number of fields that the job has to process depends on the job config. Each field that's mentioned in the job config has to be retrieved, so the only way to reduce the number of fields the datafeed has to retrieve is to mention fewer fields in the job config.
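To make that concrete, here is a rough Python sketch of how you could list the fields a job config mentions. The helper `fields_required_by_job` and the sample job are made up for illustration and this is not the datafeed's actual code; it only looks at the usual places a field name appears in a job config (detector fields, influencers, the summary count field and the time field).

```python
def fields_required_by_job(job_config):
    """Collect the field names mentioned in a job config (hypothetical helper)."""
    analysis = job_config.get("analysis_config", {})
    fields = set()

    # Each detector can name an analysed field plus by/over/partition fields.
    detector_keys = ("field_name", "by_field_name",
                     "over_field_name", "partition_field_name")
    for detector in analysis.get("detectors", []):
        for key in detector_keys:
            if detector.get(key):
                fields.add(detector[key])

    # Influencers and the summary count field are also read from the data.
    fields.update(analysis.get("influencers", []))
    if analysis.get("summary_count_field_name"):
        fields.add(analysis["summary_count_field_name"])

    # The time field is always needed to bucket the data.
    time_field = job_config.get("data_description", {}).get("time_field")
    if time_field:
        fields.add(time_field)

    return sorted(fields)


# Illustrative job config: field names are invented for this example.
example_job = {
    "analysis_config": {
        "bucket_span": "15m",
        "detectors": [
            {"function": "mean", "field_name": "responsetime",
             "by_field_name": "airline"}
        ],
        "influencers": ["airline"],
    },
    "data_description": {"time_field": "timestamp"},
}

print(fields_required_by_job(example_job))
```

Any field in your index that does not show up in a list like this (for example one that has the same value in every record) is simply never requested.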
For each field that the job requires, the datafeed decides whether to get it from doc values or from _source. It then requests the minimum set of doc values and _source fields needed to satisfy those requirements. If it decides to get everything from doc values, it even switches off fetching of _source altogether.
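As a rough illustration of that decision, the Python sketch below builds a search request body from a list of required fields. The function `build_search_body` and the `has_doc_values` lookup are assumptions made for this example, not the datafeed's real implementation; the `docvalue_fields` and `_source` keys are the standard search request options.

```python
def build_search_body(required_fields, has_doc_values):
    """Split required fields between doc values and _source (illustrative only).

    has_doc_values maps a field name to True if the field can be read
    from doc values; fields missing from the map are read from _source.
    """
    doc_value_fields = sorted(
        f for f in required_fields if has_doc_values.get(f, False))
    source_fields = sorted(
        f for f in required_fields if not has_doc_values.get(f, False))

    body = {"docvalue_fields": doc_value_fields}
    # Only fetch the _source fields that are really needed; if none are
    # needed, disable _source fetching entirely.
    body["_source"] = source_fields if source_fields else False
    return body


# Every field is available from doc values, so _source is switched off.
all_doc_values = build_search_body(
    ["airline", "responsetime", "timestamp"],
    {"airline": True, "responsetime": True, "timestamp": True},
)
print(all_doc_values)

# A field without doc values (e.g. an analysed text field) comes from _source.
mixed = build_search_body(
    ["message", "timestamp"],
    {"message": False, "timestamp": True},
)
print(mixed)
```

Either way, only the fields the job config mentions end up in the request, so there is nothing extra to strip out beforehand.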