Reindex does not honor http.max_content_length: "500m"


(Cynosureabu) #1

Reindexing from 2.4 to 6.5 with a batch size of 200 and
http.max_content_length: "500m"
set, but I still receive this exception:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "Remote responded with a chunk that was too large. Use a smaller batch size."
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "Remote responded with a chunk that was too large. Use a smaller batch size.",
    "caused_by": {
      "type": "content_too_long_exception",
      "reason": "entity content is too long [114048163] for the configured buffer limit [104857600]"
    }
  },
  "status": 400
}

Is there any other config I could use, or is this a bug?


(Nik Everett) #2

I believe we intentionally did not make this configurable, though I don't have much memory of what me-from-two-years-ago was thinking. The http.max_content_length setting governs the server's HTTP implementation, while the reindex-from-remote buffer limit isn't really the same thing.

That 100MB limit mostly exists to keep reindex-from-remote from taking up a ton of memory during the process. That 100MB is copied a few times, because reindex has to build index requests and shuffle them off to the appropriate place.

Generally we recommend using a smaller batch size or skipping the large documents explicitly. I'm aware this isn't super friendly though.
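For reference, a smaller batch size is set via source.size in the reindex request, and large documents can be skipped with a query. The sketch below is only illustrative: the host and index names are placeholders, and the range filter on _size assumes the source index has the mapper-size plugin's _size field enabled (without it, you'd need your own document-length field to filter on):

```json
POST _reindex
{
  "source": {
    "remote": {
      "host": "http://oldhost:9200"
    },
    "index": "source_index",
    "size": 50,
    "query": {
      "range": {
        "_size": { "lt": 10485760 }
      }
    }
  },
  "dest": {
    "index": "dest_index"
  }
}
```

Here size: 50 shrinks each scroll batch so the serialized chunk stays under the 100MB buffer, and the query leaves out any single document over 10MB.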


(Cynosureabu) #3

Got it.
IMHO, this should become configurable. When I am doing a reindex, I am mostly trying to copy data to my new cluster. Our machines have 64GB of memory, so a default of 100MB is really too small.