Ignore malformed


(Roy Russo) #1

We are looking to upgrade to 2.x and then 5.x. In the past, we had to fork the ES-Hadoop library to swallow the NumberFormatException thrown on read when it tries to parse a string value as an int (as required by the ES mapping). This is also mentioned here: https://github.com/elastic/elasticsearch-hadoop/issues/663

We are curious whether this is still an issue with the ES-Hadoop library, or whether an option has been added to honor the ignore_malformed setting in ES. We can't guarantee perfect customer data, so that setting is turned on in most of our indices. In the past, ES-Hadoop would blow up the whole job on a single bad cast. Is there a workaround?
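To illustrate the behavior we are after: with ignore_malformed enabled, ES accepts the document and skips indexing the bad field, but the original string stays in _source, so a reader that parses strictly by mapping type still hits it. The sketch below shows the lenient parsing we would like on read. It is written in Python for brevity rather than in the connector's Java, and the helper name lenient_int and the sample document are hypothetical, not part of the ES-Hadoop API.

```python
from typing import Any, Optional


def lenient_int(raw: Any) -> Optional[int]:
    """Hypothetical helper: parse raw as an int, returning None instead of raising."""
    try:
        return int(raw)
    except (TypeError, ValueError):
        # Malformed value: skip the field rather than fail the whole job.
        return None


# Hypothetical _source document whose fields are both mapped as integers.
source = {"user_id": "42", "age": "unknown"}

# Parse every field leniently; the malformed one becomes None.
parsed = {field: lenient_int(value) for field, value in source.items()}
```

Returning None for the bad field keeps the rest of the record usable, which is roughly what our fork did by swallowing the exception.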

