However, to save resources I would like to use a timeout, so that overly long intermediate times are cancelled and -1 is returned instead. In most cases this works. Sometimes, however, the timeout does not take effect.
Below is an example script that demonstrates the error:
filter {
  # match my test inputs
  grok {
    match => [ "message", "%{NOTSPACE:id} - %{NOTSPACE:status}" ]
  }
  # the start point
  if [status] == "start" {
    aggregate {
      task_id => "%{id}"
      code => "map['duration_map'] = (event.get('@timestamp').to_f*1000).to_i"
      map_action => "create"
      timeout => 1
    }
  }
  # the end point
  if [status] == "end" {
    # pre-fill duration_field so that it is set every time, even if the time difference exceeds the timeout
    mutate {
      add_field => { "duration_field" => "-1" }
    }
    aggregate {
      task_id => "%{id}"
      code => "event.set('duration_field', (event.get('@timestamp').to_f*1000).to_i - map['duration_map'])"
      map_action => "update"
      end_of_task => true
    }
  }
}
I entered the following records one after another for testing (a sketch of how such a test can be fed into Logstash is shown after the list):
1 - start
1 - end
2 - start
2 - end
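For reference, one minimal way to feed these records in and inspect the result is via the stdin input and the rubydebug stdout codec; this is only a sketch of such a test setup, not the configuration actually used on the test server:
input {
  # paste the test records ("1 - start", "1 - end", ...) line by line
  stdin { }
}
# ... the filter block from above goes here ...
output {
  # print every event, including duration_field, for inspection
  stdout { codec => rubydebug }
}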
However, when the script runs you can see that the timeout did not work for the second pair, even though nothing in the configuration was changed and the Logstash test server was not restarted. The problem cannot be reproduced reliably and occurs randomly. I have also seen console outputs that were about 1.5 seconds apart where the timeout did work.
Does anyone recognize this problem or have an idea I could try to solve it?
The timeout is implemented using the flush method, which Logstash calls every five seconds, so if you set timeout to 1, the effective timeout will be anywhere from 1 to 5 seconds.
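Roughly, the mechanism works like the following simplified sketch (this is only an illustration of why the timeout is checked at each flush, not the actual plugin source):
# called by Logstash roughly every 5 seconds
def flush(options = {})
  expired = []
  @maps.each do |task_id, entry|
    # a map is only discarded on the first flush after its timeout has elapsed
    if Time.now - entry[:creation_timestamp] > @timeout
      expired << task_id
    end
  end
  expired.each { |task_id| @maps.delete(task_id) }
  []   # no extra events generated in this sketch
end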
Thank you, Badger. Unfortunately, that turned out not to be the solution. I set the timeout to 10 seconds, and what can I say: sometimes it works and sometimes Logstash simply ignores my timeout.
For more information, see the code and log output below:
filter {
  # match my test inputs
  grok {
    match => [ "message", "%{NOTSPACE:id} - %{NOTSPACE:status}" ]
  }
  # the start point
  if [status] == "start" {
    aggregate {
      task_id => "%{id}"
      code => "map['duration_map'] = (event.get('@timestamp').to_f*1000).to_i"
      map_action => "create"
      timeout => 10
    }
  }
  # the end point
  if [status] == "end" {
    # pre-fill duration_field so that it is set every time, even if the time difference exceeds the timeout
    mutate {
      add_field => { "duration_field" => "-1" }
    }
    aggregate {
      task_id => "%{id}"
      code => "event.set('duration_field', (event.get('@timestamp').to_f*1000).to_i - map['duration_map'])"
      map_action => "update"
      end_of_task => true
    }
  }
}
Once again, the timeout is evaluated by a function that is called every five seconds. So a ten second timeout will actually be somewhere between 10 and 15 seconds.
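If you want to see exactly when the timeout actually fires, the aggregate filter can emit a separate event for a timed-out map. The following is only a sketch of the start branch using the documented push_map_as_event_on_timeout, timeout_task_id_field and timeout_tags options of logstash-filter-aggregate; whether and how it fits your pipeline is up to you:
aggregate {
  task_id => "%{id}"
  code => "map['duration_map'] = (event.get('@timestamp').to_f*1000).to_i"
  map_action => "create"
  timeout => 10
  # emit a dedicated, tagged event when this map times out,
  # so the effective timeout moment becomes visible in the output
  push_map_as_event_on_timeout => true
  timeout_task_id_field => "id"
  timeout_tags => ["aggregate_timeout"]
}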
OK, thanks, I understand. Here is an example from another system where the timeout also sometimes does not work.
There the timeout is set to 120 seconds. If it is exceeded, the "edi_basic_latency" field is set to 120000. This seems to work in general, since by far the most common value is 120000, as shown below.
In the following example, however, you can see that there are records well over 120 seconds: in this case around 180 seconds!
Is it perhaps possible to change the interval at which the flush method is called, so that it does not run every 5 seconds? And if so, how can I check or change that flush interval?