Heya, I'm not sure if this is a bug with Elasticsearch itself or with the elasticsearch_dsl Python library, so I'm posting here before filing an issue on GitHub.
If you set up a token filter with a boolean parameter, that parameter gets stringified:
PUT test
{
  "settings": {
    "analysis": {
      "filter": {
        "test": {
          "ignore_case": true,
          "type": "stop",
          "stopwords": [
            "h",
            "n",
            "t"
          ]
        }
      },
      "analyzer": {
        "test": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "test"
          ]
        }
      }
    }
  }
}
GET test/_settings
{
  "test" : {
    "settings" : {
      "index" : {
        "number_of_shards" : "1",
        "provided_name" : "test",
        "creation_date" : "1609856872062",
        "analysis" : {
          "filter" : {
            "test" : {
              "ignore_case" : "true",
              "type" : "stop",
              "stopwords" : [
                "h",
                "n",
                "t"
              ]
            }
          },
          "analyzer" : {
            "test" : {
              "filter" : [
                "test"
              ],
              "type" : "custom",
              "tokenizer" : "standard"
            }
          }
        },
        "number_of_replicas" : "1",
        "uuid" : "ebu4gTVqRbW6dEgleJCECA",
        "version" : {
          "created" : "7100099"
        }
      }
    }
  }
}
You'll notice "ignore_case": true becomes "ignore_case" : "true".
The filter does work as intended; however, elasticsearch_dsl trips over this type change:
from elasticsearch_dsl import Document, analyzer, token_filter, Text

class ExampleDocument(Document):
    class Index:
        name = "test2"
        using = "es7_default"

    test = Text(
        analyzer=analyzer(
            "test",
            tokenizer="standard",
            filter=[
                token_filter("test", ignore_case=True, type="stop", stopwords=["h", "n", "t"]),
            ],
        )
    )
Running ExampleDocument.init() the first time succeeds; however, running it again results in:
/vendor/elasticsearch_dsl/index.py in save(self, using)
320 for k in analysis[section]
321 ):
--> 322 raise IllegalOperation(
323 "You cannot update analysis configuration on an open index, "
324 "you need to close index %s first." % self._name
This error is raised because a check a few lines above it fails, where:
existing_analysis.get(section, {}).get(k, None) == "true"
analysis[section][k] == True
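The mismatch can be reproduced without a cluster. This is only a minimal sketch, assuming the check compares each locally defined analysis entry against the matching entry returned by GET _settings; the two dicts are hand-copied from the examples above rather than fetched from a live index, and the variable `changed` is made up for illustration:

```python
# Analysis section as returned by GET test/_settings (the boolean comes back as a string).
existing_analysis = {
    "filter": {
        "test": {"ignore_case": "true", "type": "stop", "stopwords": ["h", "n", "t"]}
    }
}

# Analysis section as built locally from token_filter(..., ignore_case=True, ...).
analysis = {
    "filter": {
        "test": {"ignore_case": True, "type": "stop", "stopwords": ["h", "n", "t"]}
    }
}

# Collect every (section, name) pair whose local definition differs from the stored one.
changed = [
    (section, name)
    for section in analysis
    for name in analysis[section]
    if existing_analysis.get(section, {}).get(name) != analysis[section][name]
]

print(changed)  # [('filter', 'test')]
```

The only difference between the two dicts is "true" versus True, which is enough for Python to treat the filter definition as changed.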
So there's an easy workaround: changing ignore_case=True to ignore_case="true". However, it seems like:
- either elasticsearch_dsl should handle this (de)serialization properly, or
- elasticsearch itself shouldn't change the type of the boolean parameter
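For the first option, one possible approach would be to normalize both sides before comparing them. The sketch below is hypothetical: `normalize` is not part of elasticsearch_dsl, and it assumes that rendering booleans as "true"/"false" and numbers via str() matches what GET _settings returns, which I have only confirmed for the boolean and integer values shown above.

```python
def normalize(value):
    """Recursively render scalars as strings, mimicking GET _settings output.

    Hypothetical helper for illustration; not an elasticsearch_dsl API.
    """
    if isinstance(value, bool):
        # bool must be checked before int, because bool is a subclass of int.
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        # Assumption: str() matches how the server renders this number.
        return str(value)
    if isinstance(value, dict):
        return {key: normalize(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [normalize(item) for item in value]
    return value


# Local definition (Python types) versus the stored definition (strings).
local = {"ignore_case": True, "type": "stop", "stopwords": ["h", "n", "t"]}
remote = {"ignore_case": "true", "type": "stop", "stopwords": ["h", "n", "t"]}

print(local == remote)                        # False
print(normalize(local) == normalize(remote))  # True
```

With a comparison like this, passing ignore_case=True or ignore_case="true" would be treated the same way.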