Cannot set _id from _source during reindex

I am trying to reindex an existing index into a new one, while modifying the document _id in the process. My new _id needs to be populated from one of the documents' fields.

I am using the following reindex request:

POST /_reindex?wait_for_completion=false

{
  "source": {
    "index": "text_index_1",
    "query": {
      "bool": {
        "filter": [
          {
            "terms": {
              "system.www_id": [
                "12341234"
              ]
            }
          }
        ]
      }
    }
  },
  "dest": {
    "index": "text_index_2"
  },
  "script": {
    "lang": "painless",
    "inline": "ctx._id = ctx._source['system.www_id']"
  }
}

but the _id in the new index is a random Elasticsearch-generated ID, which suggests that the script fails to pick up the field's value.

Here is my full setup.

First, I create two indexes with identical mappings:

PUT /text_index_1

{
    "settings": {
        "index": {
            "number_of_shards": 1,
            "hidden": false,
            "refresh_interval": "60s",
            "shard": {
                "check_on_startup": "checksum"
            }
        }
    },
    "mappings": {
        "dynamic": "true",
        "properties": {
            "system": {
                "dynamic": "true",
                "properties": {
                    "www_id": {
                        "type": "keyword"
                    }
                }
            }
        }
    }
}

and create text_index_2 with the same mappings.

Next, I create a document in text_index_1:

PUT /text_index_1/_doc/1234

{    
    "system": { 
      "www_id": "12341234"      
    }
}

and then I reindex using the reindex request I posted in the beginning.

When I do

GET /text_index_2/_search/

{
   "from": 0,
   "size": 10,
   "query":{
      "bool":{         
         "filter":[
            {
               "terms":{
                  "system.www_id":[
                     "12341234"
                  ]
               }
            }
         ]
      }
   }
}

I get

        "hits": [
            {
                "_index": "text_index_2",
                "_type": "_doc",
                "_id": "gUys0oEBhQFL4Aplz7O_",
                "_score": 0.0,
                "_source": {
                    "system": {
                        "www_id": "12341234"
                    }
                }
            }
        ]

To figure out what is going on, I modified the script to

ctx._id = ctx._source['system.www_id'] + ''

and the _id in the reindexed document was "null", which is a pretty good indication that the script fails to pick up the value of the system.www_id field.

Any idea what is going on?

Answering my own question here.

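ctx._source is a nested map, so 'system.www_id' is looked up as a single literal key, which does not exist, rather than as a path into the system object. Each level has to be dereferenced separately. The correct syntax for the script is: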

ctx._id = ctx._source.system.www_id
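
The difference between the two forms can be reproduced outside Elasticsearch. Below is a minimal sketch in Python, not Painless, in which a plain dict stands in for ctx._source and holds the document created earlier. The variable names are made up for illustration, and the sketch assumes that a Painless map lookup on a missing key yields null, which matches the "null" _id observed above.

```python
# Stand-in for ctx._source: the document indexed above, as a nested map.
source = {"system": {"www_id": "12341234"}}

# Analogue of ctx._source['system.www_id']: this looks for ONE top-level key
# literally named "system.www_id". No such key exists, so the result is None
# (null in Painless).
dotted = source.get("system.www_id")

# Analogue of ctx._source.system.www_id: walk the map one level at a time.
nested = source["system"]["www_id"]

print(dotted)  # None
print(nested)  # 12341234
```

The dotted name works in the terms query because queries address mapped field paths, while a script operates on the raw _source structure.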
