Ingestion performance issues - where to start?

wardallen · August 21, 2020, 4:11am

Hi all,

I'm having ingestion performance issues that I haven't gotten to the bottom of, and I'm quite new to the elastic stack, so I thought I'd seek advice here.

I have a cluster of 3 VMs (4 CPU/64GB RAM/500GB disk). RHEL 7.

Elasticsearch 7.8.0 is installed on all of them, and configured in a cluster (transport encrypted, http not encrypted). 26GB heap size, usually around 50% utilised.

The index has 3 shards with 2 copies (high availability was a priority).

Logstash 7.8.0 is also installed on every box, output pointing at elastic on the same box. 4GB heap size, 8 workers, 125 batch size.

Logs are round-robined through a VIP to each of the boxes.

A filter is configured in Logstash that uses csv to pull out 66 fields.

Data is on average ~400 bytes/event, the sources send it through at approx. 160Mb/s.

I'm finding that this system cannot keep up: log buffers are building up on the data source devices. When Logstash is turned on, TCP window resize requests (down to a few hundred bytes) arrive at the data source. However, I find nothing in the logs about having to throttle incoming data. Logstash is ingesting at approx. 40Mb/s.
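
To put the figures above into events per second, here is a rough back-of-the-envelope sketch. The post does not say whether "Mb/s" means megabits or megabytes, so both readings are computed; the even split across three nodes is also an assumption based on the round-robin VIP.

```python
# Rough event-rate estimate from the figures quoted in the post.
# Assumption: "Mb/s" may mean megabits or megabytes, so both are shown.

EVENT_BYTES = 400    # average event size from the post
SOURCE_RATE = 160    # "160Mb/s" offered by the sources
LOGSTASH_RATE = 40   # "40Mb/s" observed at Logstash
NODES = 3            # assumption: the VIP spreads load evenly over 3 boxes


def events_per_second(rate_mega: float, bits: bool) -> float:
    """Convert a rate in mega(bits|bytes) per second into events per second."""
    bytes_per_second = rate_mega * 1_000_000 / (8 if bits else 1)
    return bytes_per_second / EVENT_BYTES


for label, bits in (("megabits", True), ("megabytes", False)):
    offered = events_per_second(SOURCE_RATE, bits)
    ingested = events_per_second(LOGSTASH_RATE, bits)
    print(f"{label}: offered {offered:,.0f} ev/s "
          f"(~{offered / NODES:,.0f} per node), ingested {ingested:,.0f} ev/s")
```

Under either reading, Logstash is keeping up with roughly a quarter of the offered load.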

When I turn logstash off and listen with ncat dumping straight to /dev/null, this backlog disappears because throughput skyrockets.

The CPU usage hangs around 70%.

I have tried reducing the 66 fields to 8, which reduced ingest time by about a third in my lab.

Is there anywhere in this that jumps out that should be done differently? Do I have enough hardware?

Christian_Dahlqvist · August 21, 2020, 4:17am

The first thing to do is to try and determine whether Elasticsearch is the bottleneck. I often start looking at storage as indexing can be very I/O intensive and it sounds like CPU and heap usage are OK. Are you using local SSDs for storage? If not, what does iostat -x give on the nodes during indexing?

wardallen · August 21, 2020, 4:33am

Thanks, Christian. I'm not using SSDs.

# iostat -x
Linux 3.10.0-957.el7.x86_64 ()        08/21/2020    _x86_64_ (4 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          27.14    6.88    2.71    9.78    0.00   53.49

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00   460.67    0.00  311.51     0.00 12809.37    82.24     0.84    2.78    2.96    2.78   1.23  38.34
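
As a reading aid, the device line above can be turned into a couple of derived numbers. This is only a sketch: the helper name `parse_device_line` is made up for illustration, and the header and values are copied from the output above. Note that `iostat -x` run without an interval reports averages since boot, so a sample taken with an interval during indexing (for example `iostat -x 5`) would likely be more representative.

```python
# Header and device line copied from the iostat -x output above.
HEADER = ("Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s "
          "avgrq-sz avgqu-sz await r_await w_await svctm %util")
SAMPLE = ("sdb 0.00 460.67 0.00 311.51 0.00 12809.37 "
          "82.24 0.84 2.78 2.96 2.78 1.23 38.34")


def parse_device_line(header: str, line: str):
    """Map iostat column names to float values for one device line."""
    names = header.split()[1:]          # drop the "Device:" label
    device, *values = line.split()
    return device, dict(zip(names, map(float, values)))


device, stats = parse_device_line(HEADER, SAMPLE)

write_mb_s = stats["wkB/s"] / 1024               # write throughput in MB/s
avg_write_kb = stats["wkB/s"] / stats["w/s"]     # average size of one write

print(f"{device}: {write_mb_s:.1f} MB/s written, "
      f"{avg_write_kb:.1f} kB per write, "
      f"{stats['w_await']:.2f} ms write wait, {stats['%util']:.1f}% busy")
```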

stephenb · August 21, 2020, 5:42am

Have you considered running Logstash on separate VMs from Elasticsearch? Not that you can't co-locate them, but they will be competing for resources. Maybe I am not understanding your architecture.

And as Christian asked, what kind of storage are you using?

Also, hardware-wise, even though CPU says 70%, 4 CPUs for 64GB RAM plus Logstash feels light to me. You are also writing 3 copies of the data.
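
The "3 copies" point can be made concrete with a little arithmetic. This sketch assumes that "2 copies" in the original post means `number_of_replicas: 2` (one primary plus two replicas per shard) and that shard copies are spread evenly over the three nodes.

```python
# Write amplification from replicas, under the assumptions stated above.
PRIMARIES = 3   # "3 shards" from the post
REPLICAS = 2    # assumption: "2 copies" means two replicas per primary
NODES = 3

copies_per_event = 1 + REPLICAS                   # primary + replicas
total_shard_copies = PRIMARIES * copies_per_event
shard_copies_per_node = total_shard_copies / NODES

# Each event is indexed once per copy, and copies of the same shard sit on
# different nodes, so this is the share of all events each node must index.
fraction_indexed_per_node = copies_per_event / NODES

print(f"{total_shard_copies} shard copies, {shard_copies_per_node:.0f} per node")
print(f"each node indexes {fraction_indexed_per_node:.0%} of all events")
```

In other words, under these assumptions adding nodes beyond three would spread the indexing work, but with exactly three nodes every node indexes the full stream.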

Steve_Mushero · August 21, 2020, 8:00am

Agreed. I bet it's Logstash: that high CPU might mean you are maxing out a thread or two that's doing the work, likely in LS, though possibly in ES. The ES queues may show whether it's the bottleneck. IO looks very good. I bet the CSV processing and 66 fields are tying up LS too much.
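
One way to look at the ES queues is the `_cat/thread_pool` API for the `write` pool. The sketch below is a minimal example, not a definitive check: the URL `http://localhost:9200` is an assumption (the post says http is unencrypted; add authentication if the cluster requires it), and the helper names `fetch_write_pools`, `busy_nodes`, and `main` are made up for illustration.

```python
import json
from urllib.request import urlopen

ES_URL = "http://localhost:9200"  # assumption: local node, plain HTTP


def fetch_write_pools(base_url: str = ES_URL) -> list:
    """Fetch per-node write thread pool stats as a list of dicts."""
    url = (f"{base_url}/_cat/thread_pool/write"
           "?format=json&h=node_name,name,active,queue,rejected")
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def busy_nodes(rows: list) -> list:
    """Return names of nodes whose write pool has queued or rejected tasks."""
    return [row["node_name"] for row in rows
            if int(row["queue"]) > 0 or int(row["rejected"]) > 0]


def main() -> None:
    rows = fetch_write_pools()
    for row in rows:
        print(row)
    print("nodes with queued/rejected writes:", busy_nodes(rows))
```

Calling `main()` a few times while Logstash is running gives a rough picture: consistently empty queues and zero rejections would point away from Elasticsearch and toward Logstash.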

GREAT up front description, by the way, listing all the nodes, RAM heaps, data rates & sizes, and so on.

stephenb · August 21, 2020, 1:35pm

One minor additional comment: the dissect filter is much more performant than the csv filter in Logstash.

