We're investigating Elasticsearch as our search index, so I've set up the service on my local workstation for testing. We'll need the Bulk API for uploads, since this index will eventually hold a very large amount of data (50-500 TB). Bulk uploading through the NEST client works with small amounts of data (10-20 items), but with a larger dataset (~500,000 items) it fails with the error in the title. Should I look into batching? Is there a default maximum size for a bulk request? I haven't done any tuning or changed any default settings since installation.
Here is the DebugInformation from the response:
Unsuccessful low level call on POST: /_bulk
# Audit trail of this API call:
- [1] BadRequest: Node: http://localhost:9200/ Took: 00:00:23.4518933
# OriginalException: System.Net.Http.HttpRequestException: Error while copying content to a stream. ---> System.IO.IOException: Unable to read data from the transport connection: An established connection was aborted by the software in your host machine. ---> System.Net.Sockets.SocketException: An established connection was aborted by the software in your host machine
--- End of inner exception stack trace ---
at System.Net.Http.HttpConnection.WriteAsync(ReadOnlyMemory`1 source)
at System.IO.Stream.CopyToAsyncInternal(Stream destination, Int32 bufferSize, CancellationToken cancellationToken)
at System.Net.Http.HttpContent.CopyToAsyncCore(ValueTask copyTask)
--- End of inner exception stack trace ---
at System.Net.Http.HttpContent.CopyToAsyncCore(ValueTask copyTask)
at System.Net.Http.HttpConnection.SendRequestContentAsync(HttpRequestMessage request, HttpContentWriteStream stream, CancellationToken cancellationToken)
at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.SendWithNtConnectionAuthAsync(HttpConnection connection, HttpRequestMessage request, Boolean doRequestAuth, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.SendWithRetryAsync(HttpRequestMessage request, Boolean doRequestAuth, CancellationToken cancellationToken)
at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
at System.Net.Http.DiagnosticsHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
at System.Net.Http.HttpClient.FinishSendAsyncBuffered(Task`1 sendTask, HttpRequestMessage request, CancellationTokenSource cts, Boolean disposeCts)
at Elasticsearch.Net.HttpConnection.Request[TResponse](RequestData requestData)
# Request:
<Request stream not captured or already read to completion by serializer. Set DisableDirectStreaming() on ConnectionSettings to force it to be set on the response.>
# Response:
<Response stream not captured or already read to completion by serializer. Set DisableDirectStreaming() on ConnectionSettings to force it to be set on the response.>
I don't see anything in the Elasticsearch logs indicating a failure either. The line below is the last entry; before it there is only the deletion message from when I deleted and recreated the index.
[2019-02-21T15:16:52,881][INFO ][o.e.c.m.MetaDataCreateIndexService] [NODE1] [messages-content-bulk] creating index, cause [api], templates [], shards [5]/[1], mappings [elasticmsg]
Any ideas what could be causing this?
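For reference, the kind of batching asked about above could be sketched roughly as follows. This is a language-neutral illustration in Python, not NEST code: the function name `bulk_batches` and the `max_docs` / `max_bytes` thresholds are hypothetical, and the index and type names are simply the ones from the log line. Elasticsearch documents an `http.max_content_length` setting that defaults to 100mb; whether that limit is what aborted the connection here is an assumption, since the debug output does not confirm it.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List, Optional


def bulk_batches(
    docs: Iterable[Dict[str, Any]],
    index: str,
    doc_type: Optional[str] = None,
    max_docs: int = 1000,               # hypothetical threshold, tune by testing
    max_bytes: int = 5 * 1024 * 1024,   # hypothetical threshold, tune by testing
) -> Iterator[str]:
    """Yield newline-delimited JSON bodies for POST /_bulk.

    A new batch is started whenever adding the next document would exceed
    either the document count limit or the byte size limit.
    """
    meta: Dict[str, Any] = {"_index": index}
    if doc_type is not None:
        # Only relevant for clusters that still use mapping types.
        meta["_type"] = doc_type
    action_line = json.dumps({"index": meta})

    lines: List[str] = []
    size = 0
    count = 0
    for doc in docs:
        # Each document contributes an action line and a source line.
        pair = action_line + "\n" + json.dumps(doc) + "\n"
        pair_bytes = len(pair.encode("utf-8"))
        if count and (count >= max_docs or size + pair_bytes > max_bytes):
            yield "".join(lines)
            lines, size, count = [], 0, 0
        lines.append(pair)
        size += pair_bytes
        count += 1
    if count:
        yield "".join(lines)


if __name__ == "__main__":
    # Made-up sample documents, only to show the batch sizes.
    sample = ({"id": i, "text": "message %d" % i} for i in range(2500))
    for body in bulk_batches(sample, "messages-content-bulk", "elasticmsg"):
        print(len(body.encode("utf-8")), "bytes in this bulk request")
```

Each yielded string would be sent as the body of one bulk request, so no single request grows with the size of the whole dataset. NEST also ships a `BulkAll` helper that chunks a collection into multiple bulk requests, which may be the more natural route from C#.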