Subsequent bulk index requests fail after the first index

Hi There,

ES: 6.6
elastic/elasticsearch-js: 6.8
GKE: 1.12.7-gke.24
nodejs: 10.16.0

I currently have a Kubernetes cluster on GCP. What I usually do is kubectl port-forward elasticsearch-client... 9200 and test my nodejs script against http://localhost:9200.

After switching from elasticsearch-js:16 to the newer elasticsearch-js:6, and after a short hiatus, my scripts no longer work (bear with me... I believe they did work before the hiatus and after the version switch).

My script simply builds an array of alternating action and document entries for a bulk request. I then chunk that array (it is too large for a single bulk request) and iterate over the chunks, waiting for each request to finish before starting the next so as not to overwhelm the ES server. The first chunk posts successfully, but the script then hangs on the second post and eventually fails with an error.

const fs = require('fs')
const path = require('path')

const { logger, client } = await initialize()

// * Read in raw data
const attributes = JSON.parse(fs.readFileSync(path.resolve(__dirname, 'data.json')))

// * Format to match mapping and push onto action array
const actions = attributes.map(build)
  .reduce((actions, attribute) => {
    actions.push({ /* action */ }, attribute)
    return actions
  }, [])

// * Split 30000+ item array into 5000 item chunks
const actionChunks = chunk(actions, 5000)

// * Post each chunk
for (let i = 0; i < actionChunks.length; i++) {
  // * Post and wait for completion
  // ! It hangs here when i === 1
  const result = await client.bulk({ body: actionChunks[i] })

  // * Parse result for errors
  if (result.body.errors) {
    for (let j = 0; j < result.body.items.length; j++) {
      if (result.body.items[j].index.status > 299) {
        const error = result.body.items[j].index.error
        logger.error('error: %o', error)
      }
    }
  }

  // * Print progress
  logger.log(`Processed chunk ${i + 1}/${actionChunks.length}`)
}

// * Print complete
logger.log('Complete')

My error is one of the following, depending on whether gzip compression was enabled on the elasticsearch-js client (see the sketch after this list for how it is enabled):

  • (Enabled) AssertionError [ERR_ASSERTION]: zlib binding closed
  • (Disabled) Timeout Error
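
By "enabled" I mean the client was constructed with request compression turned on, roughly like this (a minimal sketch; compression is the documented client option, the node URL is just my port-forward):

const { Client } = require('@elastic/elasticsearch')

// With compression on I get the zlib assertion, with it off the timeout
const client = new Client({
  node: 'http://localhost:9200',
  compression: 'gzip' // gzip the request bodies sent to Elasticsearch
})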

So it would seem that ES is closing the connection for some currently unknown reason.

Now I am trying to figure out how to debug this situation, and I am completely lost. The GCP logs show nothing after the initial startup messages, and the ES client outputs nothing but the thrown error. I am hoping I can find some help here. Please let me know if you need to see anything else.

Thank you

Edit for more info

My topology is as follows:

  • 3x Master Nodes
  • 1x Data Node
  • 1x Client Node

My data node regularly logs the following, both during successful cron jobs run from within GCP and while the script above is running:

[2019-07-24T19:03:36,653][INFO ][o.e.m.j.JvmGcMonitorService] [elasticsearch-data-0] [gc][1787555] overhead, spent [410ms] collecting in the last [1.2s]

It seems the first request, no matter the action, consumes the connection. For example, adding

await client.ping()

after initialize() makes client.bulk hang at i === 0.
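
One way to see what the connection is actually doing is to log the client's observability events (a minimal sketch, using the client and logger returned by initialize(); 'response' and 'sniff' are documented @elastic/elasticsearch events, and the exact shape of result.meta may differ between client versions):

client.on('response', (err, result) => {
  // Log which node answered (or failed) each request
  if (err) {
    logger.error('response error: %o', err)
  } else {
    logger.log(`response from ${result.meta.connection.url.href}`)
  }
})

client.on('sniff', (err, result) => {
  // Fires whenever the client re-discovers the cluster topology
  logger.log('sniff: %o', err || result.meta)
})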

This had everything to do with the @elastic/elasticsearch Client sniffing feature. Enabling sniffing makes the client discover all nodes in the cluster and remove the old ones that don't match. This means my localhost connection was being removed on the first sniff and replaced with internal cluster IPs, which are not reachable through the port-forward.
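
In other words, when connecting through a port-forward the client has to be created without sniffing, roughly like below. This is only a sketch; the sniff* options are the documented client settings, and they all default to off, so the fix is simply not to turn them on:

const { Client } = require('@elastic/elasticsearch')

// Keep sniffing off so the client never swaps http://localhost:9200
// for the cluster-internal node addresses it discovers.
const client = new Client({
  node: 'http://localhost:9200',
  sniffOnStart: false,           // don't sniff when the client is created
  sniffOnConnectionFault: false, // don't sniff after a failed request
  sniffInterval: false           // never sniff on a timer
})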
