Elasticsearch indexing rate from Logstash

So let me try to provide as much context as possible.

We are using ECE. Both ES and Logstash are on version 6.2.4.
We have a cluster of 10 nodes with 16GB of memory per node.
Every day we have a process responsible for indexing a bunch of documents (~5M) into ES by sending them to Logstash (we use the pubsub input / es output).
Our Logstash instance is configured with 16GB of RAM and -Xms8g -Xmx8g.

logstash.yml has the following:
pipeline.workers: 40
pipeline.batch.size: 5000
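(For scale: with these settings, a single pipeline can have up to 40 workers × 5000 events = 200,000 events in flight at once.)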

Finally the logstash pipeline looks like this:

input {
  # Pull up to 5000 messages per request from the Pub/Sub subscription
  google_pubsub {
    project_id => "my-project-id"
    topic => "logstash"
    subscription => "logstash"
    max_messages => 5000
  }
}
filter {
  # Each Pub/Sub message carries the document as a JSON string in "message"
  json {
    source => "message"
  }
  # Drop the raw payload and Logstash bookkeeping fields before indexing
  mutate {
    remove_field => ["message", "@version", "@timestamp"]
  }
}
output {
  elasticsearch {
    hosts => ["my-es-host"]
    user => "elastic"
    password => "password"
    index => "my-index"
    document_id => "%{[@metadata][uuid]}"
    # Run the stored "import" script as a scripted upsert for every document
    action => "update"
    scripted_upsert => true
    script => "import"
    script_type => "indexed"
    script_lang => ""
  }
}

So basically we are seeing a fairly low indexing rate. Check out the graph below: on the left is the rate at which messages are posted to Logstash; on the right, the number of messages remaining to be processed by Logstash over time.

[graph: messages posted to Logstash (left) and messages remaining to be processed (right)]

According to the following monitoring graph, the indexing rate seems to plateau at around 1.5k docs/s, with a few spikes here and there that never last.

Finally, during this import process, CPU and memory usage are nowhere near capacity, which makes me feel like the ingestion rate could be much faster. I'm trying to figure out what the bottleneck is. Any thoughts?

Wondering what you can see in the Logstash Monitoring part...
You should be able to see how LS itself is performing IMO.

I'm moving your question to #logstash

Hi @dadoonet, thanks for the reply.

Here is a comparison of Logstash metrics and the Elasticsearch indexing rate while the process is running:

[graph: Logstash events received rate alongside Elasticsearch indexing rate]

Those graphs match perfectly, which leads me to believe that Logstash is indeed the bottleneck here. Any idea why the Events Received Rate would be so low?
Logstash CPU usage is very low (~5%) and heap usage is also fine.

Hi @dadoonet, I think you can move that back to the ES section.

I was wrong in thinking that Logstash was the issue. I configured my pipeline to send its output to stdout (for testing purposes) and checked the LS monitoring again.

The Receive Rate is now 5 times higher, which makes it obvious that ES is actually my bottleneck here.
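For reference, the test pipeline just had the elasticsearch block swapped out for stdout, roughly like this (a minimal sketch; I'm assuming the dots codec, which prints one character per event so the console itself doesn't skew the measurement):

output {
  # One dot per event: cheap enough to measure raw pipeline throughput
  stdout { codec => dots }
}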

After further investigation, I strongly suspect it has to do with the fact that I'm using a scripted_upsert:

output {
  elasticsearch {
    hosts => ["my-es-host"]
    user => "elastic"
    password => "password"
    index => "my-index"
    document_id => "%{[@metadata][uuid]}"
    action => "update"
    scripted_upsert => true
    script => "import"
    script_type => "indexed"
    script_lang => ""
  }
}

I did some tests with a plain index action instead (not a scripted upsert, see the sketch below) and it was much faster.
Is there a limit on how many scripts ES can execute in a given period?
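For the record, the plain index test was the same output with all the script options dropped, roughly (a sketch, same host and credentials as above):

output {
  elasticsearch {
    hosts => ["my-es-host"]
    user => "elastic"
    password => "password"
    index => "my-index"
    document_id => "%{[@metadata][uuid]}"
    # Plain index action: no script execution per document
    action => "index"
  }
}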

The script itself looks like this; nothing too funky:

POST _scripts/import
{
  "script": {
    "lang": "painless",
    "source": "if (ctx._source.created == null) { ctx._source.created = params.event.get('updated'); }"
  }
}

The reason I'm using a script is to be able to keep track of the created timestamp while doing an upsert.
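To make it concrete, each event ends up as an update call against ES roughly like this (a sketch with a hypothetical UUID and payload; doc is the default document type in Logstash 6.x, and the exact upsert body depends on the plugin's upsert setting):

POST my-index/doc/some-uuid/_update
{
  "scripted_upsert": true,
  "script": {
    "id": "import",
    "params": {
      "event": { "updated": "2018-01-01T00:00:00Z" }
    }
  },
  "upsert": {}
}

So every single document triggers a stored-script execution, on top of the get-modify-reindex cycle that any update implies.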

Would anyone be able to confirm that a scripted_upsert is indeed much slower than an index or even a regular upsert?
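(By "regular upsert" I mean something like the sketch below using doc_as_upsert, which skips script execution entirely; the catch is that it gives me no way to set created only at insert time, which is exactly what the script is for:)

output {
  elasticsearch {
    hosts => ["my-es-host"]
    user => "elastic"
    password => "password"
    index => "my-index"
    document_id => "%{[@metadata][uuid]}"
    action => "update"
    # Use the event itself as the upsert document; no script involved
    doc_as_upsert => true
  }
}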

