According to Coralogix, they are going to drop support for their Coralogix/Ruby-based plugin.
As a replacement, it is proposed to use the http output plugin. You can find the relevant details at the link below: Logstash - Coralogix
The behaviour of the system before the migration: messages are supplied to a Kafka topic with 50 partitions (input). The output was the Coralogix/Ruby plugin, and 10 instances of the same pipeline (all using that plugin) were able to handle the workload.
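The input side is not shown in this post, so for context here is a minimal sketch of what a Kafka input covering the 50 partitions typically looks like; the broker/topic variables, group_id, and consumer_threads values are assumptions for illustration, not taken from the actual configuration:
input {
  kafka {
    # connection settings below are placeholders, not from the real setup
    bootstrap_servers => "${KAFKA_BROKERS}"
    topics => ["${KAFKA_TOPIC}"]
    group_id => "logstash-coralogix"
    # 10 pipeline instances x 5 consumer threads = 50 partitions, one thread per partition
    consumer_threads => 5
    codec => "json"
  }
}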
The behavior of the system after the migration:
1. The successful case: the Logstash pipelines can handle the production workload from the Kafka topic with 50 partitions using 20 pipelines (instead of 10 with Coralogix/Ruby), which in turn significantly increases the expenses for this part of the infrastructure (EC2 c2.8xlarge, $992 per month).
The solution works, but its price is 2 times higher than with the original plugin (Coralogix/Ruby). You can find a sample of the output configuration below.
I would also like to mention that in this setup pool_max = 50, which is the default.
output {
  http {
    url => "${CORALOGIX_MAIN_API_URL}"
    http_method => "post"
    headers => ["private_key", "${CORALOGIX_MAIN_PRIVKEY}"]
    format => "json_batch"
    codec => "json"
    mapping => {
      "applicationName" => "${ENVIRONMENT}"
      "subsystemName" => "${SUBSYSTEM}"
      "text" => "%{[@metadata][event]}"
    }
    http_compression => true
    automatic_retries => 5
    retry_non_idempotent => false
    connect_timeout => 60
    keepalive => true
    pool_max => 50
  }
}
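For reference, here is roughly how the parallel instances of this pipeline could be declared in pipelines.yml; this is a sketch under assumptions, where the pipeline ids, config path, and worker count are placeholders rather than my actual settings:
- pipeline.id: coralogix-01                            # placeholder id
  path.config: "/etc/logstash/conf.d/coralogix.conf"   # same config reused per instance
  pipeline.workers: 4
- pipeline.id: coralogix-02
  path.config: "/etc/logstash/conf.d/coralogix.conf"
  pipeline.workers: 4
# ... and so on up to coralogix-20; all instances share the same Kafka consumer group,
# so Kafka spreads the 50 partitions across them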
2. Attempt to optimise the expenses: to reduce the number of pipelines from 20, I tried increasing the pool_max value to 300 during a set of test runs. With 13 pipelines the setup handled the workload, with noticeable lag, for 4 or 5 hours; after this period the lag of non-consumed messages became too high and I had to stop the test (13 pipelines were not able to handle the workload). The configuration for this attempt is below.
output {
  http {
    url => "${CORALOGIX_MAIN_API_URL}"
    http_method => "post"
    headers => ["private_key", "${CORALOGIX_MAIN_PRIVKEY}"]
    format => "json_batch"
    codec => "json"
    mapping => {
      "applicationName" => "${ENVIRONMENT}"
      "subsystemName" => "${SUBSYSTEM}"
      "text" => "%{[@metadata][event]}"
    }
    http_compression => true
    automatic_retries => 5
    retry_non_idempotent => false
    connect_timeout => 60
    keepalive => true
    pool_max => 300
  }
}
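A possibly relevant detail when tuning pool_max: with format => "json_batch" the http output sends each pipeline batch as a single request, so throughput per pipeline is bounded by how many batches the workers can produce and by the batch size, not only by the size of the connection pool. The knobs that control this live in logstash.yml; the values below are illustrative assumptions, not tested recommendations:
# logstash.yml (or per-pipeline in pipelines.yml); values are illustrative only
pipeline.workers: 8        # concurrent batches per pipeline, i.e. concurrent POSTs
pipeline.batch.size: 1000  # events per batch => events per json_batch request
pipeline.batch.delay: 50   # ms to wait before flushing a partially filled batch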
Question: Is this expected behavior for the http plugin, i.e. does the number of Logstash pipeline instances have to be close to (equal to or slightly less than) the number of Kafka topic partitions? In my case the Kafka topic has 50 partitions, and 20 pipelines were able to cover the load instead of the 10 needed with Coralogix/Ruby. Is there any other/additional approach that could decrease the cost of this solution and the number of Logstash pipelines that handle the workload?
Thank you.