Elasticsearch 5.1.1 remote reindex process aborts without any error

hailima · January 19, 2017, 9:24pm

Hi All,

I am migrating my index from ES 1.4.4 to 5.1.1 using the "_reindex" API. The reindex process always aborts without any error before all records have been processed. Here are the details:

Create a new index with these settings on the 5.1.1 ES server:

(PUT) http://hostname5_1_1:9200/nexttextindex
{
  "settings": {
    "index": {
      "number_of_replicas": "0",
      "number_of_shards": "5",
      "refresh_interval": "-1"
    }
  }
}

(POST) http://hostname5_1_1:9200/_reindex?wait_for_completion=false
{
  "source": {
    "remote": {
      "host": "http://hostname1_4_4.com:9200"
    },
    "index": "nexttextindex",
    "size": 10
  },
  "dest": {
    "index": "nexttextindex"
  }
}

Use task API to monitor the reindex process:

(GET) http://hostname5_1_1:9200/_tasks?detailed=true&actions=*reindex

{
  "nodes": {
    "H_cHXZN9SnqbdBNaew5wew": {
      "name": "hostname5_1_1",
      "transport_address": "10.169.167.203:9300",
      "host": "10.169.167.203",
      "ip": "10.169.167.203:9300",
      "roles": [
        "master",
        "data",
        "ingest"
      ],
      "tasks": {
        "H_cHXZN9SnqbdBNaew5wew:198701": {
          "node": "H_cHXZN9SnqbdBNaew5wew",
          "id": 198701,
          "type": "transport",
          "action": "indices:data/write/reindex",
          "status": {
            "total": 5653044,
            "updated": 0,
            "created": 43350,
            "deleted": 0,
            "batches": 6,
            "version_conflicts": 0,
            "noops": 0,
            "retries": {
              "bulk": 0,
              "search": 0
            },
            "throttled_millis": 0,
            "requests_per_second": 0,
            "throttled_until_millis": 0
          },
          "description": "",
          "start_time_in_millis": 1484860595578,
          "running_time_in_nanos": 11292914938,
          "cancellable": true
        }
      }
    }
  }
}

As you can see from the task status above, everything looks good. But after a couple of hours the task is done, and only about 60K records have been indexed on the ES 5.1.1 server instead of 5653044. I repeated the process a few times, and it always aborted without any error.

Any help is appreciated!

nik9000 · January 19, 2017, 9:45pm

You should be able to get the status of those reindex tasks with GET /_tasks/<taskId> where the <taskId> is whatever id was returned when you started. If you didn't store them then you should be able to look around with something like GET .tasks/_search. Those should contain the failure reason. Or, if it thinks it finished successfully, it should show you that.

hailima · January 19, 2017, 10:01pm

Thanks for the quick reply, Nik.

I used "GET /_tasks/<taskId>", but it's not working. Here are the details:

GET .. 9200/_tasks/198701

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "malformed task id 198701"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "malformed task id 198701"
  },
  "status": 400
}

nik9000 · January 19, 2017, 10:14pm

You'll need the part before the : in the task id as well. If you don't have it then try the search I sent.
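
To illustrate the point about the task id: the full ID is the key under "tasks" in the task list output, i.e. the node ID, a colon, then the numeric ID. Below is a minimal Python sketch, assuming the response shape posted earlier in this thread; the helper name `full_task_ids` is made up for illustration.

```python
import json

def full_task_ids(tasks_response):
    """Return every full task ID ("<nodeId>:<number>") in a GET /_tasks response."""
    ids = []
    for node in tasks_response.get("nodes", {}).values():
        # The keys of each node's "tasks" object are already the full task IDs.
        ids.extend(node.get("tasks", {}).keys())
    return ids

# Trimmed copy of the task list response posted above.
sample = json.loads("""
{
  "nodes": {
    "H_cHXZN9SnqbdBNaew5wew": {
      "tasks": {
        "H_cHXZN9SnqbdBNaew5wew:198701": {
          "node": "H_cHXZN9SnqbdBNaew5wew",
          "id": 198701,
          "action": "indices:data/write/reindex"
        }
      }
    }
  }
}
""")

for task_id in full_task_ids(sample):
    # Request line only; send it with whatever HTTP client you use.
    print("GET /_tasks/" + task_id)
```

The same full ID is what the cancel endpoint expects, so a bare number such as 198701 is rejected there as well.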

You should wrap your code blocks in ``` so they are readable.

hailima · January 19, 2017, 10:15pm

OK, the "..9200/.tasks/_search" command works and gives me the error info:

Error:

"type": "process_cluster_event_timeout_exception",
"reason": "failed to process cluster event (put-mapping) within 30s"
},
"status": 503

1) Is there any way to change the timeout (30s) to something longer? I have many different doc types in the index, and each doc type holds up to 1k docs or more.

2) If an error occurs during the reindexing process, is there any way to ignore it and continue? I have a few million records, so restarting from scratch is time-consuming. It's OK if I lose some data in the reindex process.

nik9000 · January 19, 2017, 10:30pm

Probably, but I don't know it offhand. I think you are better off manually creating the mapping before running reindex.

No. There isn't a place to store the errors, so we never implemented this.

If you want some protection against this maybe try reindexing in chunks?
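
The suggestion to create the mapping manually could look roughly like the sketch below: declare the mapping when the destination index is created, so reindex does not have to issue put-mapping updates along the way. It assumes the 5.x layout where mappings are keyed by document type; the type name `article` and the fields `title` and `published` are hypothetical, and an index with many doc types would need one entry per type.

```python
import json

def create_index_body(doc_type, properties, shards=5):
    """Build a create-index body that declares the mapping up front (ES 5.x layout)."""
    return {
        "settings": {
            "index": {
                "number_of_replicas": "0",
                "number_of_shards": str(shards),
                "refresh_interval": "-1",
            }
        },
        # In 5.x, mappings are keyed by document type.
        "mappings": {doc_type: {"properties": properties}},
    }

# "article", "title" and "published" are hypothetical names, not from the thread.
body = create_index_body(
    "article",
    {"title": {"type": "text"}, "published": {"type": "date"}},
)
print("PUT /nexttextindex")
print(json.dumps(body, indent=2))
```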

hailima · January 19, 2017, 10:30pm

I tried the following with the tasks API:

/_tasks/210411
/_tasks/210411:1

Neither of them works, but that doesn't bother me since .tasks/_search works.

The docs for this part are not clear. Where can I find descriptions of status codes like 404 or 503? Thanks.

hailima · January 19, 2017, 10:42pm

We don't need to store the errors; we just want the reindex to continue with the next records instead of aborting on a failure. Is there any setting to make that happen? What do you mean by "reindexing in chunks"? We are already using "size" for batching, right?

nik9000 · January 19, 2017, 10:59pm

There isn't.

Use a query to limit what you are reindexing to certain days or namespaces or something. Whatever natural division your data has. Then do it again and again until you migrate all the data.
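
As a rough illustration of that approach, the Python sketch below generates one _reindex body per date window, each with a range query inside "source". It assumes the documents carry a date field; the field name `created_at`, the date range, and the window size are all hypothetical, and the host and index names are the ones from the original post.

```python
import json
from datetime import date, timedelta

def chunked_reindex_bodies(start, end, days, field="created_at"):
    """Yield one _reindex body per date window [lo, hi) between start and end."""
    if days <= 0:
        raise ValueError("days must be positive")
    lo = start
    while lo < end:
        hi = min(lo + timedelta(days=days), end)
        yield {
            "source": {
                "remote": {"host": "http://hostname1_4_4.com:9200"},
                "index": "nexttextindex",
                "size": 10,
                # Only documents whose (hypothetical) date field falls in this window.
                "query": {
                    "range": {field: {"gte": lo.isoformat(), "lt": hi.isoformat()}}
                },
            },
            "dest": {"index": "nexttextindex"},
        }
        lo = hi

bodies = list(chunked_reindex_bodies(date(2016, 1, 1), date(2016, 3, 1), days=30))
for body in bodies:
    print("POST /_reindex?wait_for_completion=false")
    print(json.dumps(body))
```

If one window fails, only that window has to be rerun, which is the protection being described here.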

hailima · January 20, 2017, 12:24am

Thanks, that's very helpful! Is there any way to clean up the task errors returned by .tasks/_search?

nik9000 · January 20, 2017, 2:00am

You can and should delete them when you are done with them. You can use delete-by-query or delete each one when you know you are done with it.
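
For reference, the two cleanup options could be expressed as the requests built below. This is a sketch under two assumptions: that stored task results live at .tasks/task/<taskId>, as the 5.x reindex docs describe, and that a match_all delete-by-query is acceptable because every stored result is no longer needed. The helper names are made up for illustration.

```python
import json

def delete_task_record(task_id):
    """Request line that removes one stored task result (assumed path: .tasks/task/<id>)."""
    return "DELETE /.tasks/task/" + task_id

def delete_all_task_records():
    """Request line and body that remove every stored task result via delete-by-query."""
    return "POST /.tasks/_delete_by_query", {"query": {"match_all": {}}}

print(delete_task_record("H_cHXZN9SnqbdBNaew5wew:198701"))
line, body = delete_all_task_records()
print(line)
print(json.dumps(body))
```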

hailima · January 23, 2017, 10:46pm

OK, thanks. If I don't delete them, do they expire automatically? If so, when?

hailima · January 24, 2017, 5:00pm

Hi Nik,

The task cancel API is not working for ES 5.1.1.

Here are the ones I tried, based on the reindex docs:

(POST) /_tasks/1241809:1/_cancel
(POST) /_tasks/1241809/_cancel

Any idea?

Thanks

flow_state · January 24, 2017, 7:36pm

We are currently over a month behind on a migration because we have to babysit each and every reindex because of silent failures. We've resorted to setting the logger.root to "Debug". Good luck!

hailima · January 24, 2017, 11:21pm

I wrapped the code inside xxxxx. It's not working as expected.

system · February 21, 2017, 11:21pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
