Is there a way to allow dots in the fieldname?

The title says it all.

We are using ES 5.3.

Dots are allowed:

POST /test/test/
{
  "foo.bar.baz": "bizz buzz"
}


{
    "_index": "test",
    "_type": "test",
    "_id": "AVzHK7SvBQImgzk991KA",
    "_version": 1,
    "result": "created",
    "_shards": {
        "total": 2,
        "successful": 1,
        "failed": 0
    },
    "created": true
}

GET /test/test/AVzHK7SvBQImgzk991KA

{
    "_index": "test",
    "_type": "test",
    "_id": "AVzHK7SvBQImgzk991KA",
    "_version": 1,
    "found": true,
    "_source": {
        "foo.bar.baz": "bizz buzz"
    }
}

Here is what I get. Please note that I don't want the data to be nested. By default, if you use dots, ES uses nested type.

PUT /fieldnamestest/mytype/1
{
    "a...b" : "test"
}

RESPONSE:

{
   "error": {
      "root_cause": [
         {
            "type": "mapper_parsing_exception",
            "reason": "failed to parse"
         }
      ],
      "type": "mapper_parsing_exception",
      "reason": "failed to parse",
      "caused_by": {
         "type": "illegal_argument_exception",
         "reason": "object field starting or ending with a [.] makes object resolution ambiguous: [a...b]"
      }
   },
   "status": 400
}

Ah, yeah, that's not allowed :)

To clarify, dots only signify "objects", unless you've explicitly mapped it to nested. E.g. this:

{
  "foo.bar.baz": 123
}

is equivalent to:

{
  "foo": {
    "bar": {
      "baz": 123
    }
  }
}

But it's just an Object type, not a Nested type, unless you explicitly map it that way. See https://www.elastic.co/guide/en/elasticsearch/guide/current/nested-objects.html for more details.
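To make the equivalence above concrete, here is a rough Python sketch of the expansion. It is only an illustration of the idea, not Elasticsearch's actual parsing code; the function name expand_dotted is made up, and rejecting every empty path segment is an assumption modelled on the "a...b" error shown earlier.

```python
def expand_dotted(doc):
    """Expand dotted keys into nested dicts: {"a.b": 1} -> {"a": {"b": 1}}."""
    out = {}
    for key, value in doc.items():
        parts = key.split(".")
        # A leading, trailing, or doubled dot leaves an empty segment,
        # which has no sensible object name (assumed rule, see lead-in).
        if any(part == "" for part in parts):
            raise ValueError(f"ambiguous field name: [{key}]")
        node = out
        for part in parts[:-1]:
            node = node.setdefault(part, {})
            if not isinstance(node, dict):
                raise ValueError(f"[{part}] is already a non-object field")
        node[parts[-1]] = value
    return out


print(expand_dotted({"foo.bar.baz": 123}))
```

Calling it with {"a...b": "test"} raises ValueError, which mirrors the mapper_parsing_exception in spirit.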

But getting back to your question: you're right that dots signify object hierarchy, so you can't use them as just regular characters in the manner shown above. Sorry, I misunderstood your original question.

@polyfractal

No worries. I tried escaping the dots and it seems to work. Is that a good solution?

PUT /fieldnamestest/mytype/1
{
    "a\\.\\.\\.b" : "test"
}

Also, is this a valid configuration in version 5.x? See this document for details. I tried it and it does not work on 5.x, but it works on 2.4.x.

export ES_JAVA_OPTS="-Dmapper.allow_dots_in_name=true"

Escaping the dots "works", but it's probably not quite what you want. If you check the mappings you'll see:

GET /fieldnamestest/_mapping

{
    "fieldnamestest": {
        "mappings": {
            "mytype": {
                "properties": {
                    "a\\": {
                        "properties": {
                            "\\": {
                                "properties": {
                                    "\\": {
                                        "properties": {
                                            "b": {
                                                "type": "text",
                                                "fields": {
                                                    "keyword": {
                                                        "type": "keyword",
                                                        "ignore_above": 256
                                                    }
                                                }
                                            }
                                        }
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}

I.e. it's not really escaping the dots, it's escaping the backslashes and then using the backslashes as object names. Fundamentally it doesn't really matter, since the object name is "flattened" in Lucene to "a\\.\\.\\.b" anyway, but it's good to be aware of because "a\\.\\.\\.c" will technically be considered a sub-field of the "a\\.\\.\\" object.
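A quick way to see why the mapping looks like that is to decode the JSON key and split it on dots yourself. This is just a Python sketch of the effect, not how Elasticsearch implements it; object_path is a made-up helper name.

```python
import json

# The request body as sent over the wire; each \\ in JSON is one backslash.
raw_body = r'{"a\\.\\.\\.b": "test"}'


def object_path(field_name):
    """Split a field name on dots, the way a dotted name maps to objects."""
    return field_name.split(".")


field_name = next(iter(json.loads(raw_body)))
path = object_path(field_name)

print(field_name)  # the decoded key: a, then three backslash-dot pairs, then b
print(path)        # four segments; the middle two are a lone backslash each
```

The backslashes do nothing special to the dots; they just end up inside the segment names, which matches the "a\\" / "\\" / "\\" / "b" levels in the mapping.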

If possible I'd probably avoid doing that, since having backslashes as object names is bound to be confusing, and it could be made illegal in future versions of Elasticsearch (we'd like to lock down field name validation in the future, to prevent things like unprintable characters, etc.).

mapper.allow_dots_in_name is an old setting, and I don't think it's available in 5.x anymore. It was used because, in versions 2.0 - 2.3, dots were completely forbidden. E.g. foo.bar.baz was not allowed; you had to write foo_bar_baz. The restriction was loosened in 2.4, and allow_dots_in_name was added to expose that loosened restriction.

I believe it went away in 5.x because the default is to allow dots. But note, the dots are what we discussed above... objects in objects, etc.

You can read more about that change in elastic/elasticsearch pull request #19937, "Mappings: Allow to force dots in field names" by rjernst.

Thanks so much for the detailed explanation. The escape characters won't work because of the hierarchical mapping.

I guess the best way is to simply avoid the dots in the fieldnames. Are there any other illegal characters in fieldnames that I should be aware of?
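For anyone landing here with the same problem, one way to avoid the dots is to rewrite the keys on the client before indexing. Below is a minimal Python sketch, assuming an underscore is an acceptable replacement (as in the foo_bar_baz example above); sanitize_keys is a made-up name, and the collision check is my own addition.

```python
def sanitize_keys(doc, replacement="_"):
    """Return a copy of doc with dots in field names replaced."""
    clean = {}
    for key, value in doc.items():
        new_key = key.replace(".", replacement)
        # Two different source keys could collapse to the same name.
        if new_key in clean:
            raise ValueError(f"key collision after sanitizing: {new_key}")
        if isinstance(value, dict):
            value = sanitize_keys(value, replacement)
        clean[new_key] = value
    return clean


print(sanitize_keys({"a...b": "test"}))
```

Note that this changes the field names you query against, so the same rewrite has to be applied wherever queries are built.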
