ES-Hadoop configuration parameter es.read.source.filter doesn't accept a list

When I try to limit my ES results using Spark and the Python API I am able to retrieve only the desired fields with:
es_conf = {"es.nodes": "localhost", "es.port" : "9200", "es.read.source.filter": "some_field"}

But if I try to query for more than one field, e.g.:

es_conf = {"es.nodes": "localhost", "es.port" : "9200", "es.read.source.filter": ["some_field", "another_field"]}

I get:

An error occurred while calling z:org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD.
: java.lang.ClassCastException: java.util.ArrayList cannot be cast to java.lang.String
at org.apache.spark.api.python.PythonHadoopUtil$$anonfun$mapToConf$1.apply(PythonHadoopUtil.scala:160)
at org.apache.spark.api.python.PythonHadoopUtil$$anonfun$mapToConf$1.apply(PythonHadoopUtil.scala:160)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at org.apache.spark.api.python.PythonHadoopUtil$.mapToConf(PythonHadoopUtil.scala:160)
at org.apache.spark.api.python.PythonRDD$.newAPIHadoopRDD(PythonRDD.scala:580)
at org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD(PythonRDD.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:280)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:214)
at java.lang.Thread.run(Thread.java:748)

According to the documentation I should be able to provide a list.

Can anyone help me with this seemingly simple task?

The documentation could have been more explicit or provided an example, because the fix was simple but the wording was confusing. Placing the desired fields in a single comma-separated string worked:

es_conf = {"es.nodes": "localhost", "es.port" : "9200", "es.read.source.filter": "some_field, another_field"}
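If the field names already live in a Python list, one option is to join them into a string before building the config. The stack trace suggests that the values in the conf dict are cast to java.lang.String on the JVM side, so a list value fails there. Below is a minimal sketch of that idea; the helper name to_hadoop_conf is made up for illustration and is not part of ES-Hadoop or PySpark.

```python
def to_hadoop_conf(conf):
    """Return a copy of conf with list/tuple values joined into comma-separated strings.

    Hypothetical helper for illustration; not part of ES-Hadoop or PySpark.
    """
    return {
        key: ",".join(value) if isinstance(value, (list, tuple)) else value
        for key, value in conf.items()
    }


fields = ["some_field", "another_field"]

es_conf = to_hadoop_conf({
    "es.nodes": "localhost",
    "es.port": "9200",
    "es.read.source.filter": fields,  # list is converted to one string
})

print(es_conf["es.read.source.filter"])  # some_field,another_field
```

The resulting es_conf contains only string values and can be passed as the conf argument in the same way as the working single-field example.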

Sorry about the confusing docs, I'll push an update to them to make them more clear. Thanks for your patience!

Thanks! Really enjoy the community -- The products, the effort, and the responsiveness. I hope in the future I'll have time to help out.
