As a test, I have a very basic mapping that consists of an id field and a "colors" text field. I want the colors field to hold only one of each color. If a color already exists in the colors field, I don't want it to be added again.
To experiment, I've made two Python methods to add and remove multiple colors:
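from collections import defaultdict

def remove_multiple_items(id):
    # Build a scripted-update body that removes every listed color from the document.
    doc = defaultdict(dict)
    doc['script'] = defaultdict(dict)
    doc['script']['lang'] = "painless"
    doc['script']['source'] = "params.colors.forEach(c -> {ctx._source.colors.removeIf(e -> e.equals(c))})"
    doc['script']['params'] = {"colors": ["orange","green","black","white"]}
    update_doc(id=id, data=doc)  # update_doc is my own helper that sends the update request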
def add_multiple_items(id):
    doc = defaultdict(dict)
    doc['script'] = defaultdict(dict)
    doc['script']['lang'] = "painless"
    doc['script']['source'] = "params.colors.forEach(c -> {ctx._source.colors.add(c)})"
    doc['script']['params'] = {"colors": ["orange","green","black","white"]}
    update_doc(id=id, data=doc)
The issue with the "add_multiple_items" method is that it appends every color in params, even if that color already exists in the field, so I end up with duplicates. Here are my questions:
- Is it possible in Elasticsearch to specify a field type as a "set", where the field will only contain one of each element?
- If a set type is not implemented, what is the best method to treat a field as a set? Would I need to put logic in the painless script to check the existing values, put those values into a list and then, as I add each new color, check the list to see if that color already exists?
- Is there a "debug" command for ctx that will print out whatever I give it to print? For example, params.colors.forEach( c -> {ctx.print(c)}) ... Ideally there would be a page somewhere that listed all of the ctx commands that are available. I'm basically looking for some type of println, logging, etc. feature if it exists.
- I'd like to see an example of a painless script that will take all elements in an array and reduce them to a set. For example, if ctx._source.colors has ["yellow","orange","yellow","black"], it would be nice to see an example of a script that will reduce ctx._source.colors to ["yellow","orange","black"].
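To make the questions concrete, here is a minimal sketch of the kind of scripts being asked about. It is not a confirmed answer. It assumes there is no built-in set field type, so uniqueness has to be enforced in the script itself. All names (ADD_UNIQUE_SOURCE, DEDUPE_SOURCE, DEBUG_SOURCE, build_script_body and the simulate_* helpers) are hypothetical and introduced only for illustration. The simulate_* functions are plain-Python mirrors of the Painless logic so the intended behavior can be checked without a cluster. Debug.explain is, as far as I know, the closest thing Painless offers to a print statement: it aborts the script with an error that describes the object passed to it.

```python
# Painless: add each incoming color only if it is not already present.
ADD_UNIQUE_SOURCE = (
    "if (ctx._source.colors == null) { ctx._source.colors = []; } "
    "for (def c : params.colors) { "
    "if (!ctx._source.colors.contains(c)) { ctx._source.colors.add(c); } "
    "}"
)

# Painless: collapse an existing array to its distinct values, keeping order.
DEDUPE_SOURCE = (
    "ctx._source.colors = "
    "ctx._source.colors.stream().distinct().collect(Collectors.toList());"
)

# Painless: deliberately fail the update with an error describing the value.
DEBUG_SOURCE = "Debug.explain(ctx._source.colors);"


def build_script_body(source, params=None):
    """Return a scripted-update request body (hypothetical helper)."""
    script = {"lang": "painless", "source": source}
    if params is not None:
        script["params"] = params
    return {"script": script}


def simulate_add_unique(existing, new_colors):
    """Plain-Python mirror of ADD_UNIQUE_SOURCE, for local checking only."""
    result = list(existing or [])
    for color in new_colors:
        if color not in result:
            result.append(color)
    return result


def simulate_dedupe(existing):
    """Plain-Python mirror of DEDUPE_SOURCE; dict.fromkeys keeps first-seen order."""
    return list(dict.fromkeys(existing))
```

With the update_doc helper from the functions above, a body could then be sent as update_doc(id=id, data=build_script_body(ADD_UNIQUE_SOURCE, {"colors": ["orange", "green"]})). Whether a contains() check per element is fast enough depends on how large the array gets; for a handful of colors it should not matter.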