Help reindexing one domain to another

Hello,

I am trying to reindex one domain that has almost 10 million documents to another. The reason I am doing that is that I made a change in the analyzer, to split the dots, ignore leading zeroes and so forth.

I'm getting beaten HARD by the _reindex API. As I write this, I am running my fourth attempt so far. The problem is that I keep getting a content_too_long_exception, so I keep decreasing the batch size.

The command I am running is:

/_reindex?pretty=true&scroll=10h&wait_for_completion=false
{
  "source": {
    "remote": {
      "host": "xxxxxxxx",
      "socket_timeout": "60m"
    },
    "index": "fulltext",
    "size": 10
  },
  "dest": {
    "index": "fulltext"
  }
}

  • I started with a batch size of 1000, then 500, then 100. Now I got angry and tried 10, but I think the total time increases every time I lower it because of the per-batch overhead.
  • When the task fails, is there a way to continue from where it stopped? I don't have a date field in my index.
  • I don't know what the scroll parameter does.
  • I keep checking the status with the GET /_tasks API, but right now it is apparently stuck. This is so slow, I am getting desperate.
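On the question of continuing after a failure: one commonly used pattern, sketched here and not confirmed anywhere in this thread, is to set "op_type": "create" on the destination together with "conflicts": "proceed", so that documents already copied are counted as version conflicts and skipped instead of rewritten. The source is still scrolled from the beginning, so this saves write work, not read work. The helper name build_resumable_reindex_body and the batch size of 100 are illustrative assumptions; the host is the redacted placeholder from the post.

```python
import json


def build_resumable_reindex_body(remote_host, source_index, dest_index, batch_size):
    """Build a _reindex request body that skips documents already in the destination.

    Hypothetical helper for illustration; it only assembles the JSON body.
    """
    return {
        # Count version conflicts and keep going instead of aborting on them.
        "conflicts": "proceed",
        "source": {
            "remote": {"host": remote_host, "socket_timeout": "60m"},
            "index": source_index,
            "size": batch_size,
        },
        "dest": {
            "index": dest_index,
            # "create" only writes documents whose ID is not yet in the destination.
            "op_type": "create",
        },
    }


# "xxxxxxxx" is the redacted host from the original command; 100 is an arbitrary batch size.
body = build_resumable_reindex_body("xxxxxxxx", "fulltext", "fulltext", 100)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of the same /_reindex request shown above.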

Any help would be immensely appreciated

It'd be helpful if you could please show the entire error response you are getting.

"error": {
  "type": "illegal_argument_exception",
  "reason": "Remote responded with a chunk that was too large. Use a smaller batch size.",
  "caused_by": {
    "type": "content_too_long_exception",
    "reason": "entity content is too long [111951605] for the configured buffer limit [104857600]"
  }
}
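The numbers in this error can be read directly: the buffer limit of 104857600 bytes is exactly 100 MiB, and the rejected response was about 7 MB over it. A rough way to pick a batch size is to divide the limit, minus some headroom, by the size of the largest document in the index. The sketch below does that arithmetic; the 5 MiB "largest document" and the 80% headroom are purely assumptions for illustration, since the thread does not say how large the documents are.

```python
def max_batch_size(buffer_limit_bytes, largest_doc_bytes, headroom_percent=80):
    """Return a conservative batch size so one batch fits in the response buffer.

    Hypothetical helper: assumes every document could be as large as the largest one.
    """
    if largest_doc_bytes <= 0:
        raise ValueError("largest_doc_bytes must be positive")
    # Integer arithmetic keeps the result exact.
    usable_bytes = buffer_limit_bytes * headroom_percent // 100
    return max(1, usable_bytes // largest_doc_bytes)


BUFFER_LIMIT = 104857600   # from the error message
REPORTED_SIZE = 111951605  # from the error message

overshoot = REPORTED_SIZE - BUFFER_LIMIT
print(BUFFER_LIMIT == 100 * 1024 * 1024)  # True: the limit is 100 MiB
print(overshoot)                          # 7094005 bytes over the limit

# Assumed largest document of 5 MiB; measure the real value on the source index.
example_batch = max_batch_size(BUFFER_LIMIT, 5 * 1024 * 1024)
print(example_batch)                      # 16
```

If even very small batches fail, a few unusually large documents are the likely cause, and the batch size has to be chosen for them and not for the average document.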

UPDATE:

After trying to reindex four or five times, it got extremely slow, as if it were being throttled (I did not configure any throttling).

I have no idea why, so now I am deleting my "destination" domain and creating another.
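To tell a slow task from a throttled or stuck one, the status object returned by GET /_tasks/<task_id> can be summarized. The sketch below assumes the response has the shape I recall for reindex tasks (a "completed" flag and a "task.status" object with "total", "created", "updated", "deleted", "noops", "version_conflicts", "requests_per_second" and "throttled_millis"); check the field names against the version in use. The sample response is invented purely to exercise the function.

```python
def summarize_reindex_task(task_response):
    """Summarize progress and throttling from a GET /_tasks/<task_id> response.

    Hypothetical helper; the field names are assumed from the reindex task status format.
    """
    status = task_response["task"]["status"]
    processed = (
        status["created"]
        + status["updated"]
        + status["deleted"]
        + status.get("noops", 0)
        + status.get("version_conflicts", 0)
    )
    total = status["total"]
    return {
        "completed": task_response["completed"],
        "processed": processed,
        "fraction_done": processed / total if total else 0.0,
        # requests_per_second is -1 when no throttle is configured.
        "throttled": status["requests_per_second"] != -1
        or status["throttled_millis"] > 0,
    }


# Invented sample response, for illustration only.
sample = {
    "completed": False,
    "task": {
        "status": {
            "total": 1000,
            "created": 250,
            "updated": 0,
            "deleted": 0,
            "noops": 0,
            "version_conflicts": 0,
            "requests_per_second": -1.0,
            "throttled_millis": 0,
        }
    },
}

summary = summarize_reindex_task(sample)
print(summary)
```

If "processed" stops growing between two polls while "throttled" is false, the slowdown is probably on the source or the network side and not a reindex throttle.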
