Ability to configure "root" field for an object type

sdavids · March 3, 2019, 10:36pm

I have a mapping definition with multiple object types that looks like:

{
    "mappings": {
        "_doc": {
            "properties": {
                "identifier": {
                    "type": "object",
                    "properties": {
                        "id": {
                            "type": "keyword"
                        },
                        "upstream": {
                            "type": "keyword"
                        }
                    }
                },
                "document": {
                    "type": "object",
                    "properties": {
                        "engine": {
                            "type": "keyword"
                        },
                        "content": {
                            "type": "text"
                        }
                    }
                }
            }
        }
    }
}

This will build the fields identifier.id, identifier.upstream, document.engine, and document.content. I would like to add mappings for the "root" fields as well, i.e. the plain "identifier" and "document" fields.

For the "identifier" field I would like to be able to search across all subfields (without requiring users to specify "identifier.*"), which I believe should be a simple keyword type with all subfields specifying "copy_to": "identifier". Likewise, I would like to set up the "document" field as an alias field that points to the populated "document.content" field. I tried changing the "type" of each root field to the desired field type but was unable to create the index. Can someone please let me know whether this is achievable?

Update:
I also tried to flatten out the structure myself and do:

"properties": {
    "identifier": {
        "type": "keyword"
    },
    "identifier.id": {
        "type": "keyword",
        "copy_to": "identifier"
    },
    "identifier.upstream": {
        "type": "keyword",
        "copy_to": "identifier"
    },
    "document": {
        "type": "alias",
        "path": "document.content"
    },
    "document.engine": {
        "type": "keyword"
    },
    "document.content": {
        "type": "text"
    }
}

But received the following error:

Failed to parse mapping [_doc]: Can't merge a non object mapping [document] with an object mapping [document]
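
The error is consistent with how dotted names behave in a mapping: a name such as "identifier.id" is read as the path identifier → id, which makes "identifier" an object, so the same name cannot also be declared as a keyword or alias leaf. The Python sketch below is only a simplified model of that expansion, not Elasticsearch's actual implementation; the function name `expand_dotted` and the error text are made up for illustration.

```python
def expand_dotted(properties):
    """Expand dotted field names into a nested tree.

    Raises ValueError when one name is used both as a leaf field
    and as the parent object of another field.
    """
    tree = {}
    for name, definition in properties.items():
        node = tree
        *parents, leaf = name.split(".")
        for part in parents:
            # Every path segment before the last one must be an object.
            child = node.setdefault(part, {"properties": {}})
            if "properties" not in child:
                raise ValueError(f"[{part}] is a non-object field, cannot hold [{name}]")
            node = child["properties"]
        if leaf in node:
            raise ValueError(f"[{leaf}] already exists, cannot redefine it as [{name}]")
        node[leaf] = dict(definition)
    return tree


# The flattened mapping from the post: "identifier" is declared as a leaf
# and then implied to be an object by "identifier.id".
flattened = {
    "identifier": {"type": "keyword"},
    "identifier.id": {"type": "keyword", "copy_to": "identifier"},
}

try:
    expand_dotted(flattened)
except ValueError as err:
    print("rejected:", err)
```

In this model the conflict disappears as soon as the leaf field gets a name that is not also a parent path.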

glenacota · March 4, 2019, 4:39pm

Hi,
you are already defining a mapping for identifier and document, and that is an object datatype. You should consider using copy_to, but using a different field name than the ones already defined in the mapping. What about

...
"identifier": {
  "type": "object",
  "properties": {
    "id": {
      "type": "keyword",
      "copy_to": "identifier.*"
    },
    "upstream": {
      "type": "keyword",
      "copy_to": "identifier.*"
    }
  }
},
...

You can then query it this way:

GET your_index/_search
{
  "query": {
    "match": {
      "identifier.*": "the_id"
    }
  }
}

?
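
As a variant of the suggestion to use a different field name, here is a minimal sketch with concrete target fields. The names `identifier_all` and `document_text` are hypothetical, chosen only so they do not collide with the existing "identifier" and "document" objects; the `_doc` type follows the mapping in the original post. The Python code just builds and prints the request bodies, it does not talk to a cluster.

```python
import json

# Hypothetical names: "identifier_all" and "document_text" are not from the
# thread; any name that is not already an object in the mapping would do.
mapping = {
    "mappings": {
        "_doc": {
            "properties": {
                "identifier": {
                    "type": "object",
                    "properties": {
                        "id": {"type": "keyword", "copy_to": "identifier_all"},
                        "upstream": {"type": "keyword", "copy_to": "identifier_all"},
                    },
                },
                # Catch-all target that receives every identifier subfield value.
                "identifier_all": {"type": "keyword"},
                "document": {
                    "type": "object",
                    "properties": {
                        "engine": {"type": "keyword"},
                        "content": {"type": "text"},
                    },
                },
                # Alias that points at document.content.
                "document_text": {"type": "alias", "path": "document.content"},
            }
        }
    }
}

# Search body that targets the catch-all field instead of "identifier.*".
query = {"query": {"match": {"identifier_all": "the_id"}}}

print(json.dumps(mapping, indent=2))
print(json.dumps(query, indent=2))
```

The trade-off is that users search a separate name rather than the plain "identifier" or "document".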

sdavids · March 4, 2019, 5:08pm

So, the primary reason I am doing this is to allow users to perform logical fielded searches rather than having to know all of the various subfields (though those would be available to them if they really needed them). I am afraid that if I ask users to run their identifier searches via "identifier.*", it leaks too many details of the underlying indexing scheme, which I would like to avoid.

I wanted to get clarification on whether this is possible in Elasticsearch or not (it seems like it is not) -- though this seems to be a limitation of Elasticsearch itself and not Lucene, as the underlying Lucene index structure can accommodate a "root" field index type.

Since this request is primarily about hiding the underlying index structure from the user (via query_string queries), another option would be to map fields dynamically at query time. Is there a way to tell the Elasticsearch query parser to map the query field "document" to "document.content" and "identifier" to "identifier.*"?
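
Whether the query parser can do this natively is not settled in this thread. One option that does not depend on it is to rewrite the field prefixes in the application before the query_string query is sent. The Python sketch below assumes a hypothetical `FIELD_MAP` table and helper names (`rewrite_fields`, `build_search_body`). It is deliberately naive: it does not parse quoted phrases, so a mapped name followed by a colon inside quotes would also be rewritten.

```python
import json
import re

# Hypothetical public-name -> internal-field table; adjust to the real mapping.
FIELD_MAP = {
    "document": "document.content",
    "identifier": r"identifier.\*",  # query_string needs the * escaped with a backslash
}

# A bare field name directly followed by ":" that is not part of a dotted path,
# so "identifier.id:x" is left alone while "identifier:x" is matched.
_FIELD_PREFIX = re.compile(r"(?<![\w.])(\w+):")


def rewrite_fields(user_query):
    """Replace public field prefixes with the internal field names."""
    def swap(match):
        name = match.group(1)
        return FIELD_MAP.get(name, name) + ":"
    return _FIELD_PREFIX.sub(swap, user_query)


def build_search_body(user_query):
    """Wrap the rewritten query string in a query_string search body."""
    return {"query": {"query_string": {"query": rewrite_fields(user_query)}}}


body = build_search_body("identifier:abc AND document:report")
print(json.dumps(body, indent=2))
```

Users then only ever see "identifier" and "document", and the table can change along with the index layout.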

glenacota · March 4, 2019, 5:34pm

I'm not sure about what information is leaking, as you'll expose the underlying structure anyway by allowing queries on identifier.id and identifier.upstream.

sdavids · March 4, 2019, 5:42pm

Sure, if we documented the fact that those fields are available, then you are correct that we would be leaking the information. But if we say the only "supported" field they can search on is simply "identifier", that gives us flexibility in the future while also not exposing that those underlying fields are actually there.

system · April 1, 2019, 5:42pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
