How can I debug what influences the write speed from the Logstash file input to Elasticsearch?

Hi

I found that writing to Elasticsearch is very slow, so I put together a test:

input {
  file {
    path => "access.log"
  }
}
output {
  elasticsearch {
    index => "ur_test_%{+YYYY.MM.dd}"
    document_type => "k_gif"
    flush_size => 1000
    idle_flush_time => 3
    hosts => ["10.10.22.13:9200"]
  }
}

# my server's memory is limited, so I set the pipeline workers to 4
./logstash -f test.conf -w 4

The raw log is written at nearly 110 lines per second, but the Elasticsearch index only grows by about 50 docs every 3 seconds.

What versions of Logstash and Elasticsearch are you using? What are the specifications of the hosts running Logstash and Elasticsearch?

Hi @Christian_Dahlqvist

I am using Logstash 2.3.1 and Elasticsearch 2.3.2. The hosts are in the same datacenter but on different VLANs, from 10.10.20.0/255.255.255.0 to 10.10.22.0/255.255.255.0.

I don't think it is a network problem.

How much CPU and RAM do the hosts have? What type of storage are you using? How are you measuring indexing throughput?

The Logstash host has 16 GB of memory and a 16-core AMD 6128 CPU, but only about 3 GB of memory is free.

I'm not sure about the ES cluster's hardware. There are many other logs being written to ES, and the cluster operations engineer told me that my log's write speed is much slower than the others'.

I would recommend upgrading to Logstash 2.4 and trying to increase the number of elasticsearch output workers to 4 (the same as your number of pipeline worker threads).
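For example, something along these lines, based on your original output block (keep or tune flush_size and idle_flush_time to suit your setup):

output {
  elasticsearch {
    index => "ur_test_%{+YYYY.MM.dd}"
    document_type => "k_gif"
    flush_size => 1000
    idle_flush_time => 3
    hosts => ["10.10.22.13:9200"]
    # one output worker per pipeline worker (-w 4)
    workers => 4
  }
}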

If you do not have access to the Elasticsearch cluster to measure throughput, you can try introducing the metrics filter and writing its output to e.g. a file using conditionals.
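For example, the output side could look something like this (the file path is just a placeholder):

output {
  # route only the metric events to a local file; everything else still goes to elasticsearch
  if "metric" in [tags] {
    file {
      path => "/tmp/logstash_rate.log"
      codec => line { format => "rate: %{[events][rate_1m]}" }
    }
  }
}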

This is my test.conf:

input {
  generator {
    type => "generated"
  }
}

filter {
  if [type] == "generated" {
    metrics {
      meter => "events"
      add_tag => "metric"
    }
  }
}

output {
  if "metric" in [tags] {
    stdout {
      codec => line {
        format => "rate: %{[events][rate_1m]}"
      }
    }
  }

  elasticsearch {
    index => "ur_test_%{+YYYY.MM.dd}"
    document_type => "kkk"
    hosts => ["10.10.22.13:8200"]
    workers => 16
  }
}

If I test without the Elasticsearch output, the rate is about 20000/min:

$ ./logstash -f conf/test.conf 
Settings: Default pipeline workers: 16
Pipeline main started
rate: 0.0
rate: 18273.2
rate: 18615.553825440165
rate: 19008.827059558338
rate: 19918.175759533293
rate: 20440.175732344997
rate: 21039.14095380837
rate: 21438.843646149202
rate: 21849.22019386933
rate: 22468.25071231354
rate: 22695.56038678966
rate: 22992.998331666

When writing to ES, the rate drops sharply to about 300/min:

$ ./logstash -f conf/test.conf
Settings: Default pipeline workers: 16
Pipeline main started
rate: 399.4
rate: 383.45685627708707
rate: 364.7906766746546
rate: 351.6147416575027
rate: 331.4967377003855

And I've checked the network:

$ ping 10.10.22.13
PING 10.10.22.13 (10.10.22.13) 56(84) bytes of data.
64 bytes from 10.10.22.13: icmp_seq=1 ttl=63 time=0.133 ms
64 bytes from 10.10.22.13: icmp_seq=2 ttl=63 time=0.104 ms
64 bytes from 10.10.22.13: icmp_seq=3 ttl=63 time=0.123 ms
64 bytes from 10.10.22.13: icmp_seq=4 ttl=63 time=0.107 ms
64 bytes from 10.10.22.13: icmp_seq=5 ttl=63 time=0.099 ms
64 bytes from 10.10.22.13: icmp_seq=6 ttl=63 time=0.232 ms
64 bytes from 10.10.22.13: icmp_seq=7 ttl=63 time=0.121 ms
64 bytes from 10.10.22.13: icmp_seq=8 ttl=63 time=0.201 ms
64 bytes from 10.10.22.13: icmp_seq=9 ttl=63 time=0.100 ms
64 bytes from 10.10.22.13: icmp_seq=10 ttl=63 time=0.105 ms
64 bytes from 10.10.22.13: icmp_seq=11 ttl=63 time=0.324 ms

And the cluster ops engineer has told me that the ES cluster's average indexing rate is 8000 docs/sec.

I believe the metrics shown are the number of documents processed per second, measured as a moving average over 1 minute, not documents per minute. Based on the information you have provided, it seems possible that Elasticsearch is the bottleneck. The fact that it is processing 8000 events per second does not necessarily mean it has any spare capacity to process additional events.
