Help reindexing one domain to another

ViniciusSPaiva · August 16, 2021, 8:42pm

Hello,

I am trying to reindex one domain that has almost 10 million documents to another. The reason I am doing that is that I made a change in the analyzer, to split the dots, ignore leading zeroes and so forth.

I'm getting beaten HARD by the _reindex API. As I write that, I am executing the fourth tentative so far. The problem is that I keep getting this content_too_long_exception and keep decreasing the batch size.

The command I am running is:

/_reindex?pretty=true&scroll=10h&wait_for_completion=false
{
"source": {
"remote": {
"host": "xxxxxxxx",
"socket_timeout": "60m"
},
"index": "fulltext",
"size": 10
},
"dest": {
"index": "fulltext"
}
}

I started running with the batch size of 1000, then 500, then 100. Now I got angry and tried 10, but I think every time I do that the total time increases because of the overhead.
When the task fails isn't there a way to continue from where it stopped? I don't have a date field in my index.
I don't know what the scroll parameter does
I keep controlling the status with the GET /_tasks API. But right now is apparently stuck. This is so slow, I am getting desperate.

Any help would be immensely appreciated

warkolm · August 16, 2021, 10:02pm

It'd be helpful if you could please show the entire error response you are getting.

ViniciusSPaiva · August 17, 2021, 2:37am

"error": {
"type": "illegal_argument_exception",
"reason": "Remote responded with a chunk that was too large. Use a smaller batch size.",
"caused_by": {
"type": "content_too_long_exception",
"reason": "entity content is too long [111951605] for the configured buffer limit [104857600]"
}
}

ViniciusSPaiva · August 17, 2021, 11:39am

UPDATE:

After trying to reindex four or five times, it got extremely slow, like it is being throttled (I didn't use throttle configuration).

I don't have any clue why, so now I deleting my "destination" domain and creating another.

system · September 14, 2021, 11:39am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Reindex throws error with batch_size as low as 5 Elasticsearch	1	396	October 22, 2019
When we reindex from ES2 to ES6, get error :Remote responded with a chunk that was too large. Use a smaller batch size Elasticsearch	1	996	September 27, 2018
Reindex stops stops after 1000 docs Elasticsearch reindex	3	89	February 20, 2025
Problem when reindexing large index Elasticsearch	5	6770	February 26, 2018
Reindexing a document of size > 100mb Elasticsearch	3	2779	November 14, 2019

Help reindexing one domain to another

Related topics