Es.update.script.inline and es.write.operation


(Walker Rowe) #1

Trying to update a document using the elasticsearch-hadoop jar and Spark.

This works fine:

```
curl -XPOST --header 'Content-Type: application/json' http://localhost:9200/walker_pandas/_doc/1/_update -d '{
"script": {
"lang": "painless",
"inline": "ctx._source.last = params.last",
"params": {
"last": "Frost"
}
}
}
```

This does not:

```
j = {
    "script": {
        "lang": "painless",
        "inline": "ctx._source.last = params.last",
        "params": {
            "last": "Frost"
        }
    }
}

esconf={}
esconf["es.mapping.id" = 1 ]
esconf["es.nodes"] = "localhost"
esconf["es.port"] = "9200"
esconf["es.update.script.inline"] = j
esconf["es.write.operation"] = "update"

df.rdd.saveAsNewAPIHadoopFile(
    path='-',
    outputFormatClass="org.elasticsearch.hadoop.mr.EsOutputFormat",
    keyClass="org.apache.hadoop.io.NullWritable",
    valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
    conf=esconf)
```


(Walker Rowe) #2

Also trying something like this:

```
esconf = {}
esconf["id"] = "_id"
esconf["es.nodes"] = "localhost"
esconf["es.port"] = "9200"
esconf["es.write.operation"] = "update"
esconf["es.update.script.params.json"] = j

df.write.format("org.elasticsearch.spark.sql").options(**esconf).mode("append").save("backup_/items")
```


(James Baiera) #3

The es.update.script.inline property is meant to be the actual inline script. In this case, you should have it set to ctx._source.last = params.last.

As for the es.update.script.params parameter, you should set it to last:fieldName to pick up a value from each document that is being written, or last:&lt;Frost&gt; if you want every document to use a constant value in its parameter.
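Applied to the snippet from #1, the configuration would look something like this. This is only a sketch: the script is passed as plain text rather than a JSON document, and the "id" field used for es.mapping.id is a hypothetical field name standing in for whatever column in your DataFrame holds the document id.

```python
# Build the elasticsearch-hadoop configuration for a scripted update.
# es.update.script.inline carries the script text itself, and
# es.update.script.params maps each script parameter to either a
# document field (param:fieldName) or a constant (param:<value>).
esconf = {}
esconf["es.mapping.id"] = "id"  # hypothetical field holding the document id
esconf["es.nodes"] = "localhost"
esconf["es.port"] = "9200"
esconf["es.write.operation"] = "update"
esconf["es.update.script.inline"] = "ctx._source.last = params.last"
# Constant value "Frost" for every document; use "last:last" instead
# to read the value from each document's own "last" field.
esconf["es.update.script.params"] = "last:<Frost>"
```

The same esconf dictionary can then be passed to saveAsNewAPIHadoopFile via conf=esconf, or spread into the DataFrame writer with .options(**esconf).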

More information should be available here: https://www.elastic.co/guide/en/elasticsearch/hadoop/current/configuration.html#cfg-update