Hi,
We have a Logstash deployment where we use the Node Stats API to independently track Logstash performance and status. Up to 7.2.0 the API behaved as documented, but after updating to 7.3.0 we are seeing inconsistent reporting from the API:
-
We have multiple pipelines, and up to 7.2.0 all of them were reported. Now only one pipeline is reported, and which one it is changes with each restart. I can confirm the other pipelines are running, because I can see log messages such as
Pipelines running {:count=>4, :running_pipelines=>[:"1_snortalerts", :"2_monitor", :".monitoring-logstash", :"0_main"], :non_running_pipelines=>[]}
The actual API response looks like this:
$ curl -XGET 'http://localhost:9600/_node/stats/pipelines?pretty'
{
  "host" : "logstash2",
  "version" : "7.3.0",
  "http_address" : "127.0.0.1:9600",
  "id" : "e1b4029a-0efe-4438-bcef-e7460cb5bbb3",
  "name" : "logstash2",
  "ephemeral_id" : "1f735de7-e389-44c0-b043-96c046dbb51c",
  "status" : "green",
  "snapshot" : false,
  "pipeline" : {
    "workers" : 7,
    "batch_size" : 2000,
    "batch_delay" : 50
  },
  "pipelines" : {
    "0_main" : {
      "events" : {
        "in" : 1640089,
        "queue_push_duration_in_millis" : 103425,
        "out" : 1638527,
        "filtered" : 1638527,
        "duration_in_millis" : 3123654
      },
      "plugins" : {
        "inputs" : [ .. ],
        "codecs" : [ .. ],
        "filters" : [ .. ],
        "outputs" : [ .. ]
      },
      "reloads" : {
        "failures" : 0,
        "last_error" : null,
        "last_failure_timestamp" : null,
        "last_success_timestamp" : null,
        "successes" : 0
      },
      "queue" : {
        "type" : "persisted",
        "events_count" : 0,
        "queue_size_in_bytes" : 264287082,
        "max_queue_size_in_bytes" : 1073741824
      },
      "dead_letter_queue" : {
        "queue_size_in_bytes" : 2
      },
      "hash" : "f7e13af47e85c49bb8b0c9e095622b78d081f902f79b9cb992f1a510fd983ff5",
      "ephemeral_id" : "bf097cd2-c4ab-4fcf-9e57-7c5512ebb498"
    }
  }
}
-
Retrieving stats for a particular pipeline using /_node/stats/pipelines/PIPELINENAME?pretty returns stats only for the one pipeline that is listed when no pipeline name is given. Similarly, specifying a non-existent pipeline in this request returns the same output (not an error or a "pipeline not found" message). For the above example, the output is the same for all of the following queries:
/_node/stats/pipelines?pretty
/_node/stats/pipelines/1_notmain?pretty
/_node/stats/pipelines/foo_notpresent?pretty
-
The structure of queue reporting has also changed. Prior versions had a sub-key capacity, under which stats such as queue_size_in_bytes and max_queue_size_in_bytes were reported. Now these stats are reported directly under queue.
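As a stopgap for monitoring scripts, the two queue layouts described above could be read with one helper. This is only a sketch: the function name queue_sizes is made up for illustration, and it assumes nothing about the old layout beyond the capacity sub-key holding the two size fields.

```python
def queue_sizes(queue):
    """Return (queue_size_in_bytes, max_queue_size_in_bytes) from a pipeline's "queue" stats.

    Looks under the "capacity" sub-key first (the layout seen up to 7.2.0)
    and falls back to the top level of "queue" (the layout seen in 7.3.0).
    Missing fields come back as None.
    """
    source = queue.get("capacity", queue)
    return source.get("queue_size_in_bytes"), source.get("max_queue_size_in_bytes")
```

Called with the "queue" object from the 7.3.0 response above, this returns the two byte counts as a tuple; for a queue object without size fields it returns a pair of None values.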
Is this a documented change or a bug?
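For anyone trying to reproduce the missing-pipeline behaviour, a check along the following lines could compare the pipelines named in the log message with those present in the API response. It is a rough sketch: the helper names fetch_pipeline_stats and missing_pipelines are made up for illustration, and it assumes the API is reachable at localhost:9600 as in the curl call above.

```python
import json
from urllib.request import urlopen

# Assumption: the Logstash HTTP API listens on localhost:9600, as in the curl example.
STATS_URL = "http://localhost:9600/_node/stats/pipelines"


def fetch_pipeline_stats(url=STATS_URL):
    """Fetch and parse the node stats document for all pipelines."""
    with urlopen(url, timeout=5) as resp:
        return json.load(resp)


def missing_pipelines(stats, expected):
    """Return the expected pipeline names that are absent from the "pipelines" key."""
    reported = set(stats.get("pipelines", {}))
    return sorted(set(expected) - reported)


# Example usage against a live node (not executed here):
#   expected = ["0_main", "1_snortalerts", "2_monitor", ".monitoring-logstash"]
#   print(missing_pipelines(fetch_pipeline_stats(), expected))
```

Keeping the comparison separate from the HTTP call means the check can also be run against a saved copy of the response.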