Explicit Mapping Vs Dynamic Mapping

Recently, I needed to build a new project to replace an old one that is based on Lucene.
But in my experiments, I found what looks like a serious restriction on updating field mappings. From this page (https://www.elastic.co/guide/en/elasticsearch/reference/6.3/indices-put-mapping.html), I understand that I cannot add a new field mapping to an existing index mapping. For example, I have a text field called "title", but I can't append another text field, such as "summary". The only way is to reindex the current documents.
But when I used the dynamic mapping feature in Elasticsearch, it seemed to behave differently. First, I created a document with only one field, "title". Afterward, I created another document with two fields, "title" and "summary". Now the mapping of the index includes both fields.
Here are my questions:

  1. Is there any difference between an explicit mapping declaration and dynamic mapping? Why can't I append a new field to the existing index mapping? It seems that appending would cause no conflict.
  2. If, for the foreseeable future, I may append new fields but will not modify existing ones, what is the best way to achieve that? Since there are about 10 million documents in each index, reindexing may be a heavy operation. Should I disable dynamic mapping in the production environment? Is that the more common practice?
    Thanks.

have a text field called "title", but I can't append another text field, such as "summary".

You can update the mapping and add a new field without any problem.

The only way is to reindex the current documents.

Of course, you need to provide content for the new field in any case.

This script shows that:

DELETE test
PUT test/_doc/1
{
  "title": "foo"
}
PUT test/_doc/1
{
  "title": "foo",
  "summary": "bar"
}
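
To confirm the result, you can fetch the mapping afterwards. This is a small sketch that reuses the test index from the script above; with dynamic mapping left at its default, the response should list both title and summary under properties.

```console
# Show the mapping that dynamic mapping generated for the test index
GET test/_mapping
```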

Since there are about 10 million documents in each index, reindexing may be a heavy operation.

In my experience, most of the time is spent reading from the source of your data. For example, if you have a relational database with many relations, it takes quite some time to run all the joins needed to rebuild each object.

My advice is to reindex into a clean new index (do not update the same index). Use an alias. At the end of the "reindex" operation, just switch the alias to the new index and remove the old index.

That's a common practice.
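
A minimal sketch of that alias switch, assuming hypothetical names: posts_v1 is the old index, posts_v2 is the freshly built one, and posts_current is the alias your application queries. Adapt the names to your setup.

```console
# Atomically move the alias from the old index to the new one
POST _aliases
{
  "actions": [
    { "remove": { "index": "posts_v1", "alias": "posts_current" } },
    { "add":    { "index": "posts_v2", "alias": "posts_current" } }
  ]
}

# Once nothing reads from the old index anymore, remove it
DELETE posts_v1
```

Both actions in a single _aliases request are applied atomically, so searches against the alias never hit a moment with no index behind it.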

Sorry, it seems I didn't make myself clear.
I have disabled dynamic mapping (I am not sure whether I should disable it in the production environment), so when I want to define a new index mapping or append a new field, the only way is to use the mapping API. For example:

// add title field
PUT posts
{
  "mappings": {
    "doc" : {
      "properties": {
        "title" : {
          "type": "text"
        }
      }
    }
  }
}

// append summary field
PUT posts
{
  "mappings": {
    "doc" : {
      "properties": {
        "summary" : {
          "type": "text"
        }
      }
    }
  }
}

After the second request, there is an error response:

{
  "error": {
    "root_cause": [
      {
        "type": "resource_already_exists_exception",
        "reason": "index [posts/dD-8N9wHSe2xvXTFBuLcAQ] already exists",
        "index_uuid": "dD-8N9wHSe2xvXTFBuLcAQ",
        "index": "posts"
      }
    ],
    "type": "resource_already_exists_exception",
    "reason": "index [posts/dD-8N9wHSe2xvXTFBuLcAQ] already exists",
    "index_uuid": "dD-8N9wHSe2xvXTFBuLcAQ",
    "index": "posts"
  },
  "status": 400
}

Finally, should I enable dynamic mapping? Is there any danger or disadvantage I should consider beyond its convenience?
Thanks.

The second call to PUT posts cannot work, because it tells Elasticsearch that you want to create a new index.

Have a look at: https://www.elastic.co/guide/en/elasticsearch/reference/6.3/indices-put-mapping.html#updating-field-mappings
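
As a sketch of what that page describes, assuming the 6.3 syntax and the doc type from your example, adding summary to the existing posts index would look something like this:

```console
# Add a new field to the existing mapping; no reindex is needed for this
PUT posts/_mapping/doc
{
  "properties": {
    "summary": {
      "type": "text"
    }
  }
}
```

This only adds the field definition. Documents indexed earlier simply have no value for summary until you update them. As far as I know, this call also works when dynamic mapping is disabled.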
