Are Elasticsearch object properties really just flat properties with a namespace?

The Elasticsearch docs (Object field type | Elasticsearch Guide [8.0] | Elastic) state that object properties are, internally, essentially just flat properties with a namespace. However, when I index this document:

    POST storage-index/_doc
    {
      "person": {
        "lastName":"Miller" 
      },
      "person.lastName":"Smith"
    }

The index contains this:

        "_source" : {
          "person" : {
            "lastName" : "Miller"
          },
          "person.lastName" : "Smith"
        }

It becomes even weirder when I query: both of these searches return the document.

Object property:

    POST /storage-index/_search
    {
      "query": {
        "query_string": {
          "query": "person.lastName:Miller"
        }
      }
    }

Flat property:

    POST /storage-index/_search
    {
      "query": {
        "query_string": {
          "query": "person.lastName:Smith"
        }
      }
    }

What am I missing?

It's all flattened behind the scenes, which does not mean that you can see that in the source.

What does GET /storage-index/_mapping give?

From my understanding you're essentially writing to the same field in two different ways. If you look at the mapping there's only one field for the lastName:

    "mappings" : {
      "properties" : {
        "person" : {
          "properties" : {
            "lastName" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            }
          }
        }
      }
    }

The way the source is displayed is interesting, since from my understanding this document is essentially the same (at least from a search point of view) as this one:

    "_source" : {
      "person" : {
        "lastName" : [
          "Miller",
          "Smith"
        ]
      }
    }

I guess it's the only way to handle this without losing the differentiation completely.
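
To make the flattening idea concrete, here is a minimal Python sketch. It is only a simplified model of the behaviour described above, not Elasticsearch's actual code, and `flatten` is a hypothetical helper name. It collects every leaf value under its dotted path, so the object notation and the dotted notation end up in the same field:

```python
# Simplified model of "flat properties with a namespace".
# `flatten` is a hypothetical helper, not an Elasticsearch API.
def flatten(source, prefix=""):
    """Collect leaf values of a JSON-like dict under dotted field paths."""
    fields = {}
    for key, value in source.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into the object, carrying the path as a namespace.
            for sub_path, sub_values in flatten(value, path).items():
                fields.setdefault(sub_path, []).extend(sub_values)
        else:
            # Scalars and lists of scalars both become a list of values.
            # (Lists of objects are not handled in this sketch.)
            values = value if isinstance(value, list) else [value]
            fields.setdefault(path, []).extend(values)
    return fields


mixed = {"person": {"lastName": "Miller"}, "person.lastName": "Smith"}
array = {"person": {"lastName": ["Miller", "Smith"]}}

print(flatten(mixed))  # {'person.lastName': ['Miller', 'Smith']}
print(flatten(array))  # {'person.lastName': ['Miller', 'Smith']}
```

Under this model both documents produce the same set of values for `person.lastName`, which would explain why both searches match, while the stored `_source` still keeps the original shape.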

Ah, that's a good explanation. Thanks!

Yeah, the source display threw me off, since it shows that the system stores something that lets it distinguish how the data came into the index. But that solves my problem.

I double-checked that the dynamically created mapping is as you suggested.

There's still some weirdness here. If I update the field, behaviour is different depending on how I do it. Take a look:

    POST storage-index/_update/k3qRaH8BanEsnFgaHdHH
    {
      "doc": {
        "person.lastName": "Smithy"
      }
    }

then

    POST storage-index/_update/k3qRaH8BanEsnFgaHdHH
    {
      "doc": {
        "person": {
          "lastName": "Miley"
        }
      }
    }

Result in the index:

        "_source" : {
          "person" : {
            "lastName" : "Miley"
          },
          "person.lastName" : "Smithy"
        }

It really behaves as if it were a separate field, although the mapping in the index is shown as

        "person" : {
          "properties" : {
            "lastName" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              },
              "analyzer" : "default"
            }
          }
        },

and no other mapping for a field named lastName anywhere in the index.

That's expected again.

There is a difference between how the _source is processed and stored and the way everything gets indexed.

But why should that influence how update behaves? If the two examples are just different syntax for the same thing, I would expect the update to overwrite the entire field's value, but it doesn't. Depending on which syntax I use for the update, one or the other is overwritten, which isn't consistent with this internally being an array. This is repeatable. I checked.

If I insert an array for lastName and then do the same update (with a single string value), the whole array is replaced with the single string.
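
One way to picture what I am seeing, assuming the partial update is a key-by-key recursive merge over the stored _source map (my assumption, sketched in Python, not Elasticsearch's actual code; `merge_partial` is a hypothetical name):

```python
# Simplified model of a partial-document update on the stored _source.
# `merge_partial` is a hypothetical helper, not an Elasticsearch API.
def merge_partial(source, changes):
    """Recursively merge `changes` into `source`, key by key."""
    for key, new_value in changes.items():
        old_value = source.get(key)
        if isinstance(old_value, dict) and isinstance(new_value, dict):
            # Both sides are objects: merge their keys.
            merge_partial(old_value, new_value)
        else:
            # Anything else (string, number, list) is replaced wholesale.
            source[key] = new_value
    return source


# The top-level key "person.lastName" and the nested key "lastName"
# inside "person" are different keys in the source map.
source = {"person": {"lastName": "Miller"}, "person.lastName": "Smith"}
merge_partial(source, {"person.lastName": "Smithy"})
merge_partial(source, {"person": {"lastName": "Miley"}})
print(source)
# {'person': {'lastName': 'Miley'}, 'person.lastName': 'Smithy'}

# An existing array is not an object, so it is replaced entirely.
array_source = {"person": {"lastName": ["Miller", "Smith"]}}
merge_partial(array_source, {"person": {"lastName": "Miley"}})
print(array_source)  # {'person': {'lastName': 'Miley'}}
```

In this model the merge only compares literal keys in the source and never consults the mapping, which would reproduce both observations: each update syntax touches only its own key, and an array is replaced by a single string.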

I see what you meant. I missed that point initially.

@jpountz do you think this should be viewed as a bug?
