Since 8.11.0, with dynamic mapping enabled, a defect prevents indexing any document that has an array field containing more than 127 strings.
Here is how to reproduce:
1- start Elasticsearch 8.11.0:
docker run -p 9201:9200 -it -m 1GB -e xpack.security.enabled=false -e discovery.type=single-node docker.elastic.co/elasticsearch/elasticsearch:8.11.0
2- create an index:
PUT /testindex
body: {}
3- try to insert this doc:
PUT /testindex/_doc/foo
body:
{
"list": [
"foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo",
"foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo",
"foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo",
"foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo",
"foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo",
"foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo",
"foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo",
"foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo",
"foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo",
"foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo",
"foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo",
"foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo",
"foo", "foo", "foo", "foo", "foo", "foo", "foo", "foo"]
}
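For convenience, steps 2 and 3 can be scripted. The following is a minimal sketch using only the Python standard library; it assumes the node from step 1 is reachable at http://localhost:9201 (the host port mapped in the docker command), and the helper names `build_doc` and `put_json` are made up for this illustration.

```python
import json
import urllib.error
import urllib.request

# Assumption: Elasticsearch listens on the host port mapped in step 1.
BASE_URL = "http://localhost:9201"


def build_doc(n_items=128):
    """Return a document whose "list" field holds n_items copies of "foo"."""
    return {"list": ["foo"] * n_items}


def put_json(path, body):
    """PUT a JSON body and return (HTTP status, decoded JSON response)."""
    req = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, json.load(resp)
    except urllib.error.HTTPError as err:
        # Elasticsearch sends its error details as JSON in the response body.
        return err.code, json.load(err)


if __name__ == "__main__":
    # Step 2: create the index with an empty body.
    print(put_json("/testindex", {}))
    # Step 3: try to index the document with 128 strings.
    status, response = put_json("/testindex/_doc/foo", build_doc())
    print(status, json.dumps(response, indent=2))
```

Calling `build_doc(127)` instead produces the largest array that is still indexed without error.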
Elasticsearch rejects the document with the following error:
{
  "error": {
    "root_cause": [
      {
        "type": "parsing_exception",
        "reason": "Failed to parse object: expecting token of type [VALUE_NUMBER] but found [VALUE_STRING]",
        "line": 1,
        "col": 10
      }
    ],
    "type": "document_parsing_exception",
    "reason": "[1:10] failed to parse: Failed to parse object: expecting token of type [VALUE_NUMBER] but found [VALUE_STRING]",
    "caused_by": {
      "type": "parsing_exception",
      "reason": "Failed to parse object: expecting token of type [VALUE_NUMBER] but found [VALUE_STRING]",
      "line": 1,
      "col": 10
    }
  },
  "status": 400
}
The index mappings then show that the field was dynamically mapped as a dense_vector:
"mappings": {
  "_doc": {
    "properties": {
      "list": {
        "dims": 128,
        "similarity": "cosine",
        "index": true,
        "type": "dense_vector"
      }
    }
  }
},
Instead, the field should get the same mapping as in previous versions:
"mappings": {
  "_doc": {
    "properties": {
      "list": {
        "type": "text",
        "fields": {
          "keyword": {
            "ignore_above": 256,
            "type": "keyword"
          }
        }
      }
    }
  }
},
If the array contains fewer than 128 items, indexing works fine.
Is this a defect or a new, undocumented limitation?
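To confirm the 127/128 boundary, one could index arrays of both sizes into fresh indices and compare the HTTP status with the type chosen for the field. This is only a sketch under assumptions: it targets http://localhost:9201 as in the reproduction, the index names `testindex-127` and `testindex-128` and the helpers `request_json`, `list_field_type` and `probe` are hypothetical, and the mapping lookup accepts both the layout with a `_doc` level (as in the excerpts above) and the one without it.

```python
import json
import urllib.error
import urllib.request

# Assumption: same node and port mapping as in the reproduction steps.
BASE_URL = "http://localhost:9201"


def request_json(method, path, body=None):
    """Send a JSON request and return (HTTP status, decoded JSON response)."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    req = urllib.request.Request(
        BASE_URL + path,
        data=data,
        headers={"Content-Type": "application/json"},
        method=method,
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, json.load(resp)
    except urllib.error.HTTPError as err:
        # Error details arrive as JSON in the response body.
        return err.code, json.load(err)


def list_field_type(mapping_response, index):
    """Extract the type of the "list" field from a GET _mapping response."""
    mappings = mapping_response.get(index, {}).get("mappings", {})
    # Accept the layout with and without the "_doc" level.
    props = mappings.get("properties") or mappings.get("_doc", {}).get("properties", {})
    return props.get("list", {}).get("type")


def probe(n_items):
    """Index n_items strings into a fresh index; return (status, mapped type)."""
    index = f"testindex-{n_items}"  # hypothetical index name
    request_json("PUT", f"/{index}", {})
    status, _ = request_json("PUT", f"/{index}/_doc/foo", {"list": ["foo"] * n_items})
    _, mapping = request_json("GET", f"/{index}/_mapping")
    return status, list_field_type(mapping, index)


if __name__ == "__main__":
    for n in (127, 128):
        print(n, probe(n))
```

If the behaviour matches this report, the 127-item run should come back with a `text` mapping and the 128-item run with a 400 status and a `dense_vector` mapping.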