Search for fields with dot in name

Hi,

We are migrating from ES 6.8 to the latest ES 7.
We are using reindex.

While testing, we came across a doc with a malformed field name (the mapping does not allow the field, but it is there).

Example:

{
  "id": "123456",
  "meta": {
    "field_a": "test",
    "field_b": "test2"
  },
  "meta.field_c": "test3"
}

The field "meta.field_c" should not be there. It should be inside the "meta" object, and we don't know how it got there.

With the reindex, this field name is no longer allowed because of the dot.

We want to find all docs with this field so that we can manually move or delete them.

How would we do that? We've tried using a script in a query, without luck.

While this isn't pretty, I thought this should work. At least I can index the following successfully:

PUT test/_doc/1
{
  "id": "123456",
  "meta": {
    "field_a": "fieldA",
    "field_b": "fieldB"
  },
  "meta.field_c": "fieldC"
}

My mapping (GET test/_mapping) looks OK as well. What does yours look like?

{
  "test" : {
    "mappings" : {
      "properties" : {
        "id" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "meta" : {
          "properties" : {
            "field_a" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "field_b" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "field_c" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            }
          }
        }
      }
    }
  }
}

And I can also search them correctly:

GET test/_search
{
  "query": {
    "multi_match": {
      "query": "fieldC",
      "fields": [ "meta.*" ]
    }
  }
}

What are the errors you are running into?

Hi @xeraa,
Thank you for the reply.

The mapping we have in this test case looks like this:

    {
      "mappings": {
        "test": {
          "dynamic": "strict",
          "_all": {
            "enabled": false
          },
          "properties": {
            "id": {
              "type": "keyword"
            },
            "meta": {
              "properties": {
                "field_a": {
                  "type": "keyword"
                },
                "field_b": {
                  "type": "keyword"
                }
              }
            }
          }
        }
      }
    }

There is no mapping for the field "meta.field_c", and AFAIK it should not be possible to index it at all.

Now that we are reindexing into ES 7, we get an error because the field name contains a dot (".").

Searching for docs with "meta.field_c" has proven to be very difficult. Currently the only solution we can come up with is to traverse all docs and check them programmatically.
We would like to avoid that, since we would have to do it for billions of docs...
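
For reference, the programmatic check mentioned above could look roughly like this. It is only a sketch: the function names dotted_keys and affected_ids are made up for illustration, and it assumes the hits arrive as plain dicts with "_id" and "_source" (for example from the scroll API or the Python client's elasticsearch.helpers.scan), which is an assumption about the setup and not something we have settled on.

```python
from typing import Any, Dict, Iterable, Iterator, List, Tuple


def dotted_keys(source: Dict[str, Any], parents: Tuple[str, ...] = ()) -> List[Tuple[str, ...]]:
    """Return the key path of every key in `source` whose own name contains a dot."""
    found: List[Tuple[str, ...]] = []
    for key, value in source.items():
        path = parents + (key,)
        if "." in key:
            found.append(path)
        # Descend into nested objects and into objects inside arrays.
        if isinstance(value, dict):
            found.extend(dotted_keys(value, path))
        elif isinstance(value, list):
            for item in value:
                if isinstance(item, dict):
                    found.extend(dotted_keys(item, path))
    return found


def affected_ids(hits: Iterable[Dict[str, Any]]) -> Iterator[str]:
    """Yield the _id of every hit whose _source has at least one dotted key."""
    for hit in hits:
        if dotted_keys(hit.get("_source", {})):
            yield hit["_id"]


# Hypothetical usage with the example document from this thread.
example_hit = {
    "_id": "1",
    "_source": {
        "id": "123456",
        "meta": {"field_a": "test", "field_b": "test2"},
        "meta.field_c": "test3",
    },
}
print(list(affected_ids([example_hit])))
```

This only inspects _source on the client side, so it still means reading every document, which is exactly the cost we are hoping to avoid.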

There was a problem with dots in field names: if memory serves me correctly, they were not allowed in 2.0 to 2.3 and required a special config in 2.4. https://www.elastic.co/guide/en/elasticsearch/reference/2.4/dots-in-names.html shows the new behavior since 5.0.

meta.field_c is valid and just a different way to set a property in a subdocument.

In your mapping you have "dynamic": "strict" and since there is no field meta.field_c in your mapping, any document containing that field will be rejected.
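
To make that concrete, here is a rough client-side approximation of the check in Python. It is not Elasticsearch's own code, and the helper name unmapped_fields is made up for this post. It walks a document, resolves dotted keys through the nested "properties" of the mapping, and lists every field path that has no mapping entry, which are the fields a "dynamic": "strict" mapping would refuse.

```python
from typing import Any, Dict, List


def unmapped_fields(doc: Dict[str, Any], properties: Dict[str, Any], prefix: str = "") -> List[str]:
    """Return the full paths of fields in `doc` that have no entry in `properties`."""
    missing: List[str] = []
    for key, value in doc.items():
        path = prefix + key
        # Treat "a.b" like {"a": {"b": ...}} and follow the mapping level by level.
        node: Any = {"properties": properties}
        for part in key.split("."):
            node = node.get("properties", {}).get(part)
            if node is None:
                break
        if node is None:
            missing.append(path)
        elif isinstance(value, dict):
            # Mapped object field: check its sub-fields against its own properties.
            missing.extend(unmapped_fields(value, node.get("properties", {}), path + "."))
    return missing


# The "properties" section of the strict mapping posted above.
strict_properties = {
    "id": {"type": "keyword"},
    "meta": {
        "properties": {
            "field_a": {"type": "keyword"},
            "field_b": {"type": "keyword"},
        }
    },
}

example_doc = {
    "id": "123456",
    "meta": {"field_a": "test", "field_b": "test2"},
    "meta.field_c": "test3",
}
print(unmapped_fields(example_doc, strict_properties))
```

For the example document this reports only meta.field_c, since id, meta.field_a and meta.field_b are all present in the mapping.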

Since I assume you want to have that field, you'll need to extend your mapping.

Hello again,

Thanks for the reply.

The root problem is that when reindexing from our ES 6 to ES 7, we get an error because of the field name with a dot. So we want to be able to find the documents with the wrong field and handle them before we do the reindex.

We have not been able to find a way to search for them though.

"We have not been able to find a way to search for them though."

Do you have an example of what the problem is there?

Because like I said earlier: Since Elasticsearch 5.0, "meta.field_c": "..." is the same as "meta": { "field_c": "..." }.
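
If it helps, the equivalence can be illustrated with a small Python sketch that rewrites dotted keys into nested objects. This is only an illustration of the idea and not how Elasticsearch implements it; expand_dots and _merge are hypothetical helper names, and objects inside arrays are left untouched to keep it short.

```python
from typing import Any, Dict


def _merge(target: Dict[str, Any], key: str, value: Any) -> None:
    """Put `value` under `key`, merging objects and refusing to overwrite a value."""
    if isinstance(value, dict) and isinstance(target.get(key), dict):
        for sub_key, sub_value in value.items():
            _merge(target[key], sub_key, sub_value)
    elif key in target:
        raise ValueError(f"conflicting values for field {key!r}")
    else:
        target[key] = value


def expand_dots(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `doc` in which every "a.b" key becomes {"a": {"b": ...}}."""
    result: Dict[str, Any] = {}
    for key, value in doc.items():
        if isinstance(value, dict):
            value = expand_dots(value)
        *parents, leaf = key.split(".")
        target = result
        for part in parents:
            child = target.setdefault(part, {})
            if not isinstance(child, dict):
                raise ValueError(f"field {part!r} is both a value and an object")
            target = child
        _merge(target, leaf, value)
    return result


dotted = {
    "id": "123456",
    "meta": {"field_a": "test", "field_b": "test2"},
    "meta.field_c": "test3",
}
nested = {
    "id": "123456",
    "meta": {"field_a": "test", "field_b": "test2", "field_c": "test3"},
}
print(expand_dots(dotted) == nested)
```

Both forms of the example document end up as the same nested structure, with field_c sitting next to field_a and field_b inside meta.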