It is important that users can search for parts of words, so I'm setting up nGram analysis. It works nicely when I set both min_gram and max_gram to 4, but when I tried raising max_gram to 6 I ran into a problem.
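For reference, the index settings look roughly like this (a simplified sketch using the Python client; the index, analyzer, and field names are placeholders for my actual setup):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder address

# Simplified version of my index: an nGram tokenizer so partial
# words match. 4/4 works fine; raising max_gram to 6 is what
# triggers the problem described below.
es.indices.create(
    index="emails",  # placeholder name
    body={
        "settings": {
            "analysis": {
                "tokenizer": {
                    "my_ngram_tokenizer": {
                        "type": "ngram",
                        "min_gram": 4,
                        "max_gram": 6,  # was 4; raising it to 6 causes the failure
                        "token_chars": ["letter", "digit"],
                    }
                },
                "analyzer": {
                    "my_ngram_analyzer": {
                        "type": "custom",
                        "tokenizer": "my_ngram_tokenizer",
                        "filter": ["lowercase"],
                    }
                },
            },
            # Needed on ES 7+ once max_gram - min_gram exceeds 1.
            "index.max_ngram_diff": 2,
        },
        "mappings": {
            "properties": {
                "subject": {"type": "text", "analyzer": "my_ngram_analyzer"},
                "body": {"type": "text", "analyzer": "my_ngram_analyzer"},
            }
        },
    },
)
```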
The problem occurs when I run the initial indexing of all existing content, which consists of emails of various sizes. I send them for indexing in batches of 100, and after about 500 documents (five batches, that is) I get:
```
Maximum timeout reached while retrying request. Call: Status code unknown from: POST /_bulk
```
The next batch fails the same way. Then at around 700 documents I get:
```
The remote server returned an error: (429) Too Many Requests.. Call: Status code 429 from: POST /_bulk. ServerError: Type: circuit_breaking_exception Reason: "[parent] Data too large, data for [<http_request>] would be [1018201304/971mb], which is larger than the limit of [986932838/941.2mb], real usage: [896683280/855.1mb], new bytes reserved: [121518024/115.8mb]"
```
It feels like Elasticsearch is running out of memory and that I'm sending in too much at a time. If I read the error correctly, the 941.2mb limit is the parent circuit breaker, which by default trips at 95% of the JVM heap, so the node seems to be running with roughly a 1 GB heap.
I have reduced the batch size to 50, but I get the same error at the same point.
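The indexing loop itself does roughly the following (again a simplified Python sketch of what my actual client does; the index name, field names, and the get_all_emails helper are placeholders):

```python
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch("http://localhost:9200")  # placeholder address

def get_all_emails():
    # Placeholder: stands in for loading the real emails from storage.
    return [{"id": i, "subject": f"mail {i}", "body": "..."} for i in range(700)]

def index_in_batches(emails, batch_size=100):  # also tried batch_size=50
    batch = []
    for email in emails:
        batch.append({
            "_index": "emails",  # placeholder index name
            "_id": email["id"],
            "_source": {"subject": email["subject"], "body": email["body"]},
        })
        if len(batch) == batch_size:
            helpers.bulk(es, batch)  # one POST /_bulk per batch
            batch = []
    if batch:
        helpers.bulk(es, batch)  # flush the final partial batch

index_in_batches(get_all_emails())
```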
Any ideas?