NumberFormatException for string type

Hi,

Did you create the new index before running the reindex? Please follow the example below. It worked fine for me on Elasticsearch 2.4.5 (since you mention "string", I assume you are on 2.x; in 5.x the "string" type was split into "text" and "keyword").

Let's start fresh and delete all involved indices for this example:

DELETE /old_books
DELETE /new_books

Then create the old index and add a document to it.

PUT /old_books
{
   "mappings": {
      "author": {
         "properties": {
            "name": {
               "type": "string"
            },
            "auid": {
                "type": "string"
            }
         }
      }
   }
}

PUT /old_books/author/1
{
    "name": "Mark Twain",
    "auid": "42"
}

Now we realize we made a mistake, so we create the new index and correct the mapping of the auid field to integer:

PUT /new_books
{
   "mappings": {
      "author": {
         "properties": {
            "name": {
               "type": "string"
            },
            "auid": {
                "type": "integer"
            }
         }
      }
   }
}

Then let the reindex API do its job:

POST /_reindex
{
  "source": {
    "index": "old_books"
  },
  "dest": {
    "index": "new_books"
  }
}

This worked fine for me. Here is the response:

{
   "took": 56,
   "timed_out": false,
   "total": 1,
   "updated": 0,
   "created": 1,
   "batches": 1,
   "version_conflicts": 0,
   "noops": 0,
   "retries": 0,
   "throttled_millis": 0,
   "requests_per_second": "unlimited",
   "throttled_until_millis": 0,
   "failures": []
}
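If you run the reindex from a script rather than by hand, it may be worth checking the response instead of eyeballing it. Here is a minimal Python sketch, assuming the response body has already been parsed into a dict; `reindex_succeeded` is a hypothetical helper name and not part of any Elasticsearch client:

```python
def reindex_succeeded(response, expected_total=None):
    """Return True if a parsed _reindex response reports a clean run."""
    if response.get("timed_out"):
        return False
    # Any entry in "failures" means at least one document was not copied.
    if response.get("failures"):
        return False
    if response.get("version_conflicts", 0) > 0:
        return False
    # Optionally compare against the document count you expected to copy.
    if expected_total is not None and response.get("total") != expected_total:
        return False
    return True


# The response shown above, parsed into a dict.
sample = {
    "took": 56, "timed_out": False, "total": 1, "updated": 0, "created": 1,
    "batches": 1, "version_conflicts": 0, "noops": 0, "retries": 0,
    "throttled_millis": 0, "requests_per_second": "unlimited",
    "throttled_until_millis": 0, "failures": [],
}

print(reindex_succeeded(sample, expected_total=1))  # prints True
```

Which fields count as an error is a judgment call; the checks here only cover the ones visible in the response above.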

Note that although Elasticsearch now indexes the field as an integer, it is still a string in _source:

GET /new_books/author/_search
{
    "query": {
        "match_all": {}
    }
}

shows:

{
  ...
      "hits": [
         {
            "_index": "new_books",
            "_type": "author",
            "_id": "1",
            "_score": 1,
            "_source": {
               "name": "Mark Twain",
               "auid": "42"
            }
         }
      ]
...

If you do not want that, you can use an inline script to convert the value in _source as well during reindexing. Keep in mind that inline scripts are a security risk and that you should prefer stored scripts for this. If you still want to use an inline script, you need to (temporarily) enable inline scripting by setting script.inline: true in config/elasticsearch.yml and restarting your cluster:

POST /_reindex
{
  "source": {
    "index": "old_books"
  },
  "dest": {
    "index": "new_books"
  },
  "script": {
    "inline": "ctx._source.auid = Integer.valueOf(ctx._source.auid)"
  }
}

This will significantly slow down the reindexing process, though. Also check how you can reindex without downtime.
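Since Integer.valueOf throws a NumberFormatException for any value that is not a valid integer, it can help to look for such values before reindexing. The following Python sketch only approximates Java's parsing rules (optional sign, decimal digits, 32-bit range) and works on plain dicts; the helper names and the sample documents are made up for illustration:

```python
import re

# Approximation of what Integer.valueOf accepts: optional sign plus digits.
_INT_PATTERN = re.compile(r"[+-]?[0-9]+")
INT_MIN, INT_MAX = -2**31, 2**31 - 1


def parses_as_java_int(value):
    """Return True if value is a string that fits a 32-bit signed integer."""
    if not isinstance(value, str) or not _INT_PATTERN.fullmatch(value):
        return False
    return INT_MIN <= int(value) <= INT_MAX


def convert_auid(source):
    """Mimic the inline script on one _source dict, returning a new dict."""
    if not parses_as_java_int(source.get("auid")):
        raise ValueError("auid is not a valid integer: %r" % (source.get("auid"),))
    converted = dict(source)
    converted["auid"] = int(source["auid"])
    return converted


def find_unconvertible(docs):
    """Return the ids of (doc_id, source) pairs whose auid would not convert."""
    return [doc_id for doc_id, source in docs
            if not parses_as_java_int(source.get("auid"))]


# Hypothetical sample data; only document "1" comes from the example above.
docs = [
    ("1", {"name": "Mark Twain", "auid": "42"}),
    ("2", {"name": "Unknown", "auid": "n/a"}),
    ("3", {"name": "Overflow", "auid": "99999999999"}),
]

print(convert_auid(docs[0][1]))   # {'name': 'Mark Twain', 'auid': 42}
print(find_unconvertible(docs))   # ['2', '3']
```

Documents flagged this way would need to be fixed or filtered out first, because the script would otherwise fail on them.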

Daniel