Filebeat to Elasticsearch log shipping is very slow

rajisankar · June 12, 2018, 4:04pm

Hi,

I have found some earlier posts on this, but none gives a definitive answer for Beats 6.x, where spool_size has been removed.

The following tests have been performed on
Filebeat 6.x
Elasticsearch 6.2.4 with 16GB Heap

Filebeat config with File Output gives 80,000 events/s

output.file:
  path: "/opt/CCURfilebeat"
  filename: filebeat
  number_of_files: 7
  permissions: 0600

queue:
  mem:
    events: 40000
    flush.min_events: 20000

Filebeat config with elasticsearch output gives 14,000 events/s

output.elasticsearch:
  hosts: ["elastic-server:9200"]
  bulk_max_size: 20000
  username: "elastic"
  password: "elasticpassword"

queue:
  mem:
    events: 40000
    flush.min_events: 20000

The Elasticsearch indexing rate is 13,800 events/s, and this seems to be the bottleneck.

What I don't understand is that Elasticsearch CPU utilization is 10% and JVM heap used is 6GB/16GB. Why, then, is the indexing rate still so low? What other factors should we consider in order to stress the Elasticsearch system?

Any suggestions on improving this performance would be highly appreciated.
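
One rough way to reason about the numbers above is a back-of-the-envelope model, sketched here in Python. It assumes each output worker sends one bulk request and waits for the response before sending the next, and that every request carries a full batch of bulk_max_size events. Both are simplifying assumptions and not a description of how Filebeat actually schedules requests. The function names are made up for this illustration.

```python
def implied_round_trip_seconds(bulk_size: int, events_per_second: float,
                               workers: int = 1) -> float:
    """Time each bulk request would have to take for `workers` senders,
    each waiting for a response before sending again, to sustain the rate."""
    if events_per_second <= 0:
        raise ValueError("events_per_second must be positive")
    return workers * bulk_size / events_per_second


def max_events_per_second(bulk_size: int, round_trip_seconds: float,
                          workers: int = 1) -> float:
    """Upper bound on throughput under the same simplified model."""
    if round_trip_seconds <= 0:
        raise ValueError("round_trip_seconds must be positive")
    return workers * bulk_size / round_trip_seconds


if __name__ == "__main__":
    # Figures from the post: bulk_max_size 20000, about 14,000 events/s.
    rtt = implied_round_trip_seconds(bulk_size=20000, events_per_second=14000)
    print(f"implied time per bulk request: {rtt:.2f} s")
    # What the model predicts if the same latency held with 4 workers (hypothetical).
    print(f"model bound with 4 workers: {max_events_per_second(20000, rtt, workers=4):.0f} events/s")
```

Under this model a single sender at 14,000 events/s implies well over a second per bulk request, which would point at request latency and concurrency (for example the worker option of the Elasticsearch output) more than at CPU or heap on the Elasticsearch side.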

Christian_Dahlqvist · June 12, 2018, 4:18pm

Have you optimised Elasticsearch for indexing speed? What type of storage do you have? What do disk I/O and iowait look like? How many nodes are in the cluster?

rajisankar · June 21, 2018, 6:10pm

Hi,

After a brief benchmark, it appears that the problem was with Elasticsearch indexing speed. Thanks, Christian!

For benchmarking purposes, we have a single-node Elasticsearch cluster with a single index, one shard, and no replicas. The machine has a 400GB SSD, 32GB RAM (of which 16GB is allocated to the heap), and a 12-core processor.
We are using a single merge thread, a 2GB translog flush threshold, and a 30s refresh interval.
Index settings are as follows.

"index.merge.scheduler.max_thread_count" : "1",
"index.translog.flush_threshold_size" : "2gb",
"index.refresh_interval": "30s",
"index.mapping.total_fields.limit":"30000"

I am unable to index more than 60,000 documents/s from Filebeat.
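
For anyone reproducing this setup, the indexing-related settings listed above could be assembled into a JSON request body like the sketch below. The function name and defaults are only for illustration, and the body is meant to be sent to the index settings endpoint (PUT /<index>/_settings) or included at index creation. Whether each setting can be changed on a live index should be checked against the documentation for your Elasticsearch version.

```python
import json


def build_settings_body(refresh_interval: str = "30s",
                        flush_threshold_size: str = "2gb",
                        merge_max_threads: int = 1) -> str:
    """Return a JSON body carrying the indexing settings quoted in the post."""
    settings = {
        "index.merge.scheduler.max_thread_count": str(merge_max_threads),
        "index.translog.flush_threshold_size": flush_threshold_size,
        "index.refresh_interval": refresh_interval,
    }
    return json.dumps(settings, indent=2)


if __name__ == "__main__":
    print(build_settings_body())
```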

According to X-Pack monitoring, CPU utilisation is 60%, JVM heap utilisation is 72%, and disk I/O is 130 MB/s. None of these appears to be the bottleneck.

I am not sure what else might be the bottleneck. Is there a way to find out which factors might contribute to this?
Thanks in advance.

Christian_Dahlqvist · June 21, 2018, 6:44pm

If you are using dynamic mappings (I am guessing this may be the case based on the number of fields you have specified) and are adding fields as indexing progresses, each change will require the cluster state to get updated, which can slow indexing down.
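
To rule this out, the mapping can be declared up front with dynamic set to strict, so that documents containing unmapped fields are rejected instead of triggering a mapping update. The sketch below builds such a body in Python for a 6.x-style index, which still uses a mapping type name. The type name "doc" and the field names are placeholders, not taken from this thread.

```python
import json


def strict_mapping(fields: dict, doc_type: str = "doc") -> dict:
    """Build an index-creation body whose mapping rejects unknown fields."""
    return {
        "mappings": {
            doc_type: {
                "dynamic": "strict",  # unmapped fields cause an error, not a mapping update
                "properties": {name: {"type": ftype} for name, ftype in fields.items()},
            }
        }
    }


if __name__ == "__main__":
    # Hypothetical fields for an HTTP access log.
    body = strict_mapping({"@timestamp": "date", "message": "text", "status": "short"})
    print(json.dumps(body, indent=2))
```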

rajisankar · June 21, 2018, 7:22pm

Sorry, that parameter was not needed. We use static mappings. Any other factors that might affect this performance?

I am wondering why it is capped at 60,000 documents/s when Elasticsearch can do much more. Not knowing what the bottleneck is bothers me. I am sure I am missing something here.

rajisankar · June 22, 2018, 4:22pm

According to the Elasticsearch benchmarks for HTTP logs at https://elasticsearch-benchmarks.elastic.co/index.html#tracks/http-logs/nightly/30d , the indexing rate seems to be 171,000 docs/s for a 3-node Elasticsearch cluster.

With a single-node Elasticsearch, I am able to index 60,000 docs/s. That seems fine, but is this comparison valid?

Apart from the usual resources, like CPU, memory, disk I/O, and network, what other factors could limit Elasticsearch performance?

Christian_Dahlqvist · June 22, 2018, 4:28pm

Documents per second is not really a very good measurement of indexing performance as it will depend a lot on the size and complexity of the documents being indexed. You will get a better comparison if you run the same Rally track on your hardware.
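
A small illustration of why document counts alone are hard to compare: the same docs/s figure corresponds to very different data volumes depending on the average document size. The sizes used below are hypothetical and only show the arithmetic. They are not measurements from this thread or from the benchmark.

```python
def ingest_mb_per_second(docs_per_second: float, avg_doc_bytes: float) -> float:
    """Convert an indexing rate in docs/s to MB/s (1 MB = 1,000,000 bytes)."""
    return docs_per_second * avg_doc_bytes / 1_000_000


if __name__ == "__main__":
    # Hypothetical average sizes: a small benchmark document vs. a larger log event.
    for label, rate, size in [
        ("171,000 docs/s at 100 bytes each", 171_000, 100),
        ("60,000 docs/s at 500 bytes each", 60_000, 500),
    ]:
        print(f"{label}: {ingest_mb_per_second(rate, size):.1f} MB/s")
```

With these made-up sizes, the lower document rate actually moves more data per second, which is why running the same Rally track on both setups gives a fairer comparison.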

rajisankar · June 22, 2018, 6:46pm

Yeah, that makes sense. But this benchmark is for HTTP logs, and I am importing raw logs from an HTTP server as well, so I was hoping it would be close enough.

I am still struggling to find what else could be the bottleneck. Any pointers would be highly appreciated.

Christian_Dahlqvist · June 25, 2018, 6:57am

The standard HTTP logs track uses very small documents, so it may or may not be comparable. I created a track that simulates events that are a bit larger and is probably closer to what you would get out of Filebeat. We talked about it here and it is available on GitHub.

system · July 23, 2018, 6:58am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
