Need to know how to control writes to the storage account for Azure Event Hub access

I am using Azure Event Hubs as a Logstash input. Please advise how I can control how often events are written to the storage account for Event Hub access.

This would help with cost saving. Please advise your thoughts.

Can you give more context? It is not clear what you want to do or what your issue is.

Do you want to limit the writes to the storage account that Logstash uses while reading from an Event Hub?

Yes. Currently the Azure Event Hubs input writes every 5 seconds, and I need to increase that to 30 or 60 seconds, but I don't know how to do that or where to make the changes. Please help me.

There is nothing in the configuration of the input that controls that, I'm afraid.

What you can try is to increase the value of max_batch_size, so that more events are retrieved each time.
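
For example, a minimal sketch; the connection values are placeholders and the batch size is illustrative:

input {
  azure_event_hubs {
    event_hub_connections => ["<connection-string>;EntityPath=<event-hub-name>"]
    max_batch_size => 500   # illustrative: retrieve more events per batch
  }
}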

Thank you. I have tried setting checkpoint_interval => 30 in my Logstash configuration.

I tried increasing it from the default to 10 and then 30 seconds, but events are still being written to the storage account every 5 seconds, because that is the default.

My aim is to delay the writes: events are written to the storage account every 5 seconds by default, and I want to increase that interval to 20, 30, or 60 seconds.

Please help me achieve this with the Azure Event Hubs Logstash input.

As per your recommendation, I tried increasing the batch size from the current 290 to 300 in the Azure Event Hubs Logstash input configuration, but there was no improvement. If I increase it beyond 300, for example to 320, Logstash stops receiving events entirely and I get the error below. Is there a maximum batch size limit for Logstash?

Error: [stopping the reactor because thread was interrupted or the reactor has no more events to process.]

For context, these are the rates I am seeing in the Kubernetes pod through the Logstash pipeline:

rate: 157.6098245327287
rate: 155.06645139167588
rate: 160.76996702722207
rate: 154.66376160882123
rate: 157.74494910737187

A new rate line is written every 5 seconds; I want to delay each new rate (that is, each new event write) to every 30 seconds.

Here is my Logstash configuration for reference. It writes events every 5 seconds, and I want it to write every 30 seconds instead. How can I achieve this?

input {
  azure_event_hubs {
    event_hub_connections => ["${CLCEHPRIMARYCONNECTIONSTRING};EntityPath=cef"]
    storage_connection => "${CLCSTORAGEPRIMARYENDPOINT}"
    storage_container => "offsets-parser-cef"
    threads => 33
    decorate_events => true
    consumer_group => "${CLCEHCONSUMERGROUP}"
    codec => "cef"
    max_batch_size => 290
  }
}

# The filter producing [events][rate_1m] (e.g. a metrics filter) is not shown here.
output {
  if "metric" in [tags] {
    stdout {
      codec => line {
        format => "rate: %{[events][rate_1m]}"
      }
    }
  }
}

Hi @leandrojmp and all, looking forward to your earliest reply.

Any update would be highly appreciated.

As I said before:

Also, this is not an error; it is a warning:

stopping the reactor because thread was interrupted or the reactor has no more events to process

It means that there were no more events in your event hub to be consumed and processed.

Thank you for your reply. This error appeared, as per your recommendation, when I increased the batch size from 290 to 320; if I put it back to 300 the error goes away immediately. I want to delay how often Logstash collects events from Azure Event Hubs: currently events come in every 5 seconds, and I need every event to be collected through Logstash every 30 seconds instead. I tried increasing the checkpoint interval, but no luck. Please help me achieve this.

  • With a configured Azure storage account, the restart will be at the last checkpoint, which will be written every 5 seconds (default) or at the end of the last batch of data read.

How can I control this? I want to increase it to 1 minute.

This is controlled by the checkpoint_interval value.

You can set it to 60 so that checkpoints are written every 60 seconds, but checkpoints are also written at the end of every batch, independent of the value of this setting.

This is in the documentation:

Checkpoints are automatically written at the end of each batch, regardless of this setting.

You do not have full control over when Logstash writes the checkpoints.

What you can try is what I already suggested: increase max_batch_size so Logstash processes more events per batch, and see if the behavior changes.
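
For reference, a minimal sketch of those two settings together, reusing the environment variables from your configuration; the values themselves are illustrative:

input {
  azure_event_hubs {
    event_hub_connections => ["${CLCEHPRIMARYCONNECTIONSTRING};EntityPath=cef"]
    storage_connection => "${CLCSTORAGEPRIMARYENDPOINT}"
    storage_container => "offsets-parser-cef"
    consumer_group => "${CLCEHCONSUMERGROUP}"
    codec => "cef"
    max_batch_size => 500        # larger batches mean fewer batch-end checkpoint writes
    checkpoint_interval => 60    # time-based checkpoints every 60 s; batch-end writes still happen
  }
}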

As already said, the message below is not an error; it is a warning saying that there are no more events to be processed. Logstash will resume processing when the event hub receives more events.

stopping the reactor because thread was interrupted or the reactor has no more events to process

Increase the max_batch_size, try 500 or higher, and see how it behaves. I don't think there is anything else you can do, but as I said, you do not have full control over when Logstash writes a checkpoint to the storage account.

If this is really an issue for you, there is the option of not using a storage account at all, but whether that is viable depends on whether you have just one Logstash instance or several consuming from the same event hub.
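
A sketch of what that would look like; note that with storage_connection omitted the plugin tracks offsets in memory only, so after a restart it falls back to initial_position (the value below is an illustrative choice):

input {
  azure_event_hubs {
    event_hub_connections => ["${CLCEHPRIMARYCONNECTIONSTRING};EntityPath=cef"]
    # storage_connection / storage_container omitted: no checkpoint writes to Azure Storage;
    # offsets are kept in memory and are lost when Logstash restarts
    consumer_group => "${CLCEHCONSUMERGROUP}"
    codec => "cef"
    initial_position => "end"    # illustrative: after a restart, read only new events
  }
}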

Thank you for your detailed answer, I will try this and update you.

I have tried increasing checkpoint_interval, but there is no change in how often events are written; it is still the default 5 seconds. When I increase the batch size beyond 500, Logstash keeps looping with the repeated warning and final info message below:

[2022-09-13T17:32:42,955][WARN ][com.microsoft.azure.eventhubs.impl.MessagingFactory][main][6ead40b00fbb8cbcd50386a8178c3b903ed50b4e621447eb000fac1b6c3bf4a7] messagingFactory[MessagingFactory8de299], hostName[z-soc-ngs-clc-test-ew1-evh01.servicebus.windows.net], message[stopping the reactor because thread was interrupted or the reactor has no more events to process.]
[2022-09-13T17:32:42,955][INFO ][logstash.inputs.azure.processor][main][6ead40b00fbb8cbcd50386a8178c3b903ed50b4e621447eb000fac1b6c3bf4a7] Event Hub: cef, Partition: 9 is closing. (reason=Shutdown)

If there is no change, I don't think there is anything else you can do. Maybe someone from Elastic can chime in and give some recommendations, but from what is in the documentation, there is nothing else to change.

Why is this an issue for you? Do you have other Logstash instances reading from the same Event Hub? If you do not, maybe you can avoid using the storage account at all; just be aware of the impact this can have when restarting Logstash.

Since it writes from Azure Event Hubs to storage every 5 seconds, it is causing higher cost. We want to delay the writes so that we can reduce the cost of ingesting events. We are not having any issue with AWS or any other Logstash plugins; we are only facing this default 5-second write behavior with the Azure Event Hubs Logstash plugin.

Unfortunately this seems to be how the plugin works.

I can think of a couple of options.

  • Do not use the storage account to write the checkpoints.
  • Build a custom collector that saves the logs from the event hub to files and configure Logstash to read from those files (see the sketch below).
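
A rough sketch of the second option, assuming a hypothetical collector that drains the event hub into line-delimited files under /var/log/eventhub/; the path, sincedb location, and codec are all assumptions:

input {
  file {
    path => "/var/log/eventhub/*.log"                    # hypothetical output directory of the collector
    mode => "tail"
    sincedb_path => "/var/lib/logstash/sincedb-eventhub" # assumed writable location for read offsets
    codec => "cef"                                       # assuming the collector writes raw CEF lines
  }
}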

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.