Mapping of dest index overwritten upon reindexing

ETspielberg · March 19, 2020, 3:53pm

Hi,

I am using Elasticsearch 7.6.1 and have indexed a number of documents whose field types were generated automatically. For example, I have timestamps that are currently indexed as text. In addition, I need to adjust the analyzers, as some of the fields contain special characters. I would like to reindex these fields as dates, or change their analyzer to the keyword analyzer.

I created a new index with the appropriate field types and tried to use the reindex API to convert the data into the new mapping. However, as soon as I have executed the reindex, the mapping of the dest index has the same fields as the mapping of the source index.

Is there a way to preserve the mapping of the dest index and to try to convert the fields into the types of the new index (e.g. from timestamps as text to dates)?

Thanks in advance!
Cheers,
Eike

spinscale · March 19, 2020, 4:28pm

What you expect is also the behaviour that should happen. You create the index upfront (including the mapping) and the mapping itself will not be modified. Can you share a minimal example to reproduce this behaviour? To me it sounds as if some component in your tooling is deleting the index after you created it manually.

ETspielberg · March 19, 2020, 5:37pm

Thanks for the reply.

The mappings are rather large; however, one example of a changing field is the edition of a given book. In the index "book", generated by an external programme, the corresponding part of the source mapping reads (copied from Index Management in Kibana):

"edition": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
},

I created a new index "book_v1" to adjust the mapping (planning to delete the old initial index and put an alias "book" in front of it). I created this index from the Kibana Dev Tools and confirmed the intended mapping in Index Management (again copied directly from Kibana):

"edition": {
  "type": "integer"
},

I then run

POST _reindex
{
  "source": {
    "index": "book"
  },
  "dest": {
    "index": "book_v1"
  }
}

This command runs well and does not produce any errors:

{
  "took" : 440,
  "timed_out" : false,
  "total" : 44,
  "updated" : 0,
  "created" : 44,
  "deleted" : 0,
  "batches" : 1,
  "version_conflicts" : 0,
  "noops" : 0,
  "retries" : {
    "bulk" : 0,
    "search" : 0
  },
  "throttled_millis" : 0,
  "requests_per_second" : -1.0,
  "throttled_until_millis" : 0,
  "failures" : [ ]
}

However, the mapping of the book_v1 index now contains (copied from Kibana):

"edition": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
},

Similarly, analyzers that I had defined for certain fields are no longer present, e.g.:

"doi": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  },
  "analyzer": "keyword"
},

becomes

"doi": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
}

As said above, the complete mapping is rather large and contains nested fields. Can this cause problems?

ETspielberg · March 19, 2020, 5:44pm

I forgot some technical details. I use the free version of Elasticsearch, running locally on my dev PC:

"version": {
    "number": "7.6.1",
    "build_flavor": "default",
...
 "lucene_version": "8.4.0",

I send the HTTP commands with Postman or from the Kibana Dev Tools (Kibana 7.6.1). I tried both and had the same effect. The original data had been indexed using a spring boot application.

spinscale · March 20, 2020, 8:52am

Nested fields should not be a problem either. Most of this looks good to me.

I can only repeat what I said before: without a proper reproduction it will become really hard to help, even if that means some more work on your side. Snippets are not enough and will always mean we're assuming certain details.

Once you have created an index, the mapping of an existing field cannot be changed, which again makes me think there is something deleting your index first. Also, you may want to check for index templates, even though creating the index with an explicit mapping takes precedence (and you saw the correct field when you retrieved the index).
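
A related safeguard, as a minimal sketch: if the dest index is created with "dynamic": "strict", documents that contain fields missing from the mapping should be rejected instead of silently extending it, so a mismatch would show up as reindex failures. The Python below only builds the request body for `PUT book_v1`; the helper name strict_index_body and the cut-down field list are assumptions for illustration (only edition and doi appear in this thread).

```python
import json


def strict_index_body(properties):
    """Build a create-index body whose mapping rejects unmapped fields."""
    return {
        "mappings": {
            # "strict": a document with an unknown field is rejected
            # instead of the field being added to the mapping dynamically.
            "dynamic": "strict",
            "properties": properties,
        }
    }


# Hypothetical, cut-down version of the intended dest mapping.
body = strict_index_body({
    "edition": {"type": "integer"},
    "doi": {"type": "text", "analyzer": "keyword"},
})

# Send the printed JSON as the body of `PUT book_v1`
# (Kibana Dev Tools, Postman, or any HTTP client).
print(json.dumps(body, indent=2))
```

Inner objects inherit the dynamic setting from their parent unless they override it, so setting it once at the top level is usually enough for a check like this.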

Are you doing this whole process in Kibana Dev Tools?

ETspielberg · March 20, 2020, 12:42pm

Hi,

I just found the error. I am very sorry for bothering you, but I spent hours figuring it out.

I used a Spring Boot application to ingest the data and to create the initial (= source) index. Then I copied the mapping from Kibana into my IDE and adjusted the fields and analyzers. Then I created a new index (= dest) using the PUT command, both from Postman and from the Kibana Dev Tools (I tried a lot of times...).

However, the two indices differed slightly: there was an additional field "book" in between, and when new data were posted, the fields were created next to it. So the reindex was not replacing the fields; it was merely adding them side by side with the new structure, shifted by one level:

"edition": "..."
"doi": "..."
"book": {
  "edition": null,
  "doi": null,
  ...
}

Due to the quite large mapping structure it was hard to see. But thanks very much for the help!
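
For anyone hitting the same thing, a difference like this is easier to spot when both mappings are flattened to dotted field paths and compared. The following is a minimal sketch in plain Python, assuming the "properties" part of each mapping has been copied out as a dict; the function name field_paths and the two cut-down mappings are hypothetical and only mirror the edition/doi example above.

```python
def field_paths(properties, prefix=""):
    """Return {dotted field path: type} for a mapping's "properties" dict."""
    paths = {}
    for name, spec in properties.items():
        path = prefix + name
        if "properties" in spec:
            # Object or nested field: recurse into its sub-fields.
            paths.update(field_paths(spec["properties"], path + "."))
        else:
            paths[path] = spec.get("type", "object")
    return paths


# Hypothetical dest mapping as created by hand (fields sit under "book").
intended = {
    "book": {
        "properties": {
            "edition": {"type": "integer"},
            "doi": {"type": "text", "analyzer": "keyword"},
        }
    }
}

# Hypothetical dest mapping after the reindex: the source documents carried
# top-level fields, so dynamic mapping added them next to "book".
after_reindex = dict(intended, edition={"type": "text"}, doi={"type": "text"})

before, after = field_paths(intended), field_paths(after_reindex)
for path in sorted(set(after) - set(before)):
    print("added by dynamic mapping:", path, "->", after[path])
```

With the sample data, the added paths are the top-level doi and edition, while book.doi and book.edition keep their intended types, which is the "shifted by one level" situation described above.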

system · April 17, 2020, 12:43pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
