Connection errors during indexing using goroutines

apo_p9 · March 30, 2022, 7:24am

I have a syncing utility in Go that moves data from MongoDB to ES. Here is the gist of it: create a WaitGroup, launch a goroutine listening on a channel, pull results from Mongo and publish them to that channel, whose listener indexes each one into ES in its own goroutine.

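wg := &sync.WaitGroup{}

// Consumer: starts one goroutine per document received, with no upper bound.
go func() {
	for {
		select {
		case data := <-resultFeed:
			go func() {
				defer wg.Done()
				pushToES(data)
			}()
		}
	}
}()

// ... pull from Mongo
for {
	wg.Add(1) // matched by wg.Done in the consumer
	resultFeed <- row
}
// ...
wg.Wait()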

With either the official ES client or olivere, pushing the results to ES from a goroutine gives me connection errors like:

no available connection: no Elasticsearch node available
read tcp 127.0.0.1:52778->127.0.0.1:9200: read: connection reset by peer
write tcp 127.0.0.1:54947->127.0.0.1:9200: write: broken pipe

They occur intermittently: some documents are indexed successfully, then others fail with one of these errors.
Both ES and Mongo are default local installations. There are no logs produced from ES.

If I don't use a goroutine there are no errors, but it's obviously much slower, and I chose Go specifically for the ability to index concurrently. I haven't tried bulk requests because indexing is not the only thing happening for each document, so I'd like to keep it per document.

spinscale · March 30, 2022, 8:01am

I have no idea about your code, so all of the following is just a crazy assumption, which you are free to dismiss.

When using goroutines or going async in general, are you capping the number of concurrent requests to Elasticsearch, or do you keep sending new requests without enforcing a concurrency limit (i.e. only 5 requests in flight at the same time, and triggering the 6th only once another one has returned)?

That could be a problem.

apo_p9 · March 30, 2022, 8:12am

Interesting. There are no limitations right now.
@spinscale Would there be a configuration setting that I can modify to increase the number of requests allowed in ES?

spinscale · March 30, 2022, 8:47am

Doing this on the server side is always an arms race IMO. I'd rather solve this on the client side (and maybe run a simple test first to check whether this is really the problem, before going deeper). If Elasticsearch is overwhelmed with the number of index operations (but the HTTP connection still works), it will tell you.

Also, make sure to reuse your HTTP connections instead of creating new ones for each request.

apo_p9 · March 30, 2022, 8:52am

@spinscale Thank you for your suggestion. I implemented a buffer and haven't seen the errors since.

system · April 27, 2022, 8:52am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
