Splitting for Logstash working intermittently

eleong · November 6, 2022, 11:51pm

Hi,

Currently, I have Logstash configured for splitting. The issue is that sometimes it works, and sometimes it goes about 10-15 minutes without any output (it is supposed to produce output every 5 minutes).

At the backend, I have scheduled a cron job that calls an API and writes the response to a file (JSON format). Logstash then reads the file and splits the "result" field.

Here's my Logstash code:

input {
  file {
    start_position => "end"
    path => ["/var/log/logstash/abc_api-*.json"]
    sincedb_path => "/dev/null"
  }
  #stdout { codec => rubydebug { metadata => true } }
}

filter {
  json {
    source => "message"
  }
  split {
    field => "result"
  }
  #stdout { codec => rubydebug { metadata => true } }
}

output {
  elasticsearch {
    ssl => true
    ssl_certificate_verification => false
    cacert => "/etc/logstash/elasticsearch-ca.pem"
    hosts => "https://10.0.0.2:9200"
    user => "${LS_USER}"
    password => "${LS_PWD}"
    manage_template => true
    index => "abc-logs-%{+YYYY.MM.dd}"
    pipeline => "abc-api"
  }
  #stdout { codec => rubydebug { metadata => true } }
}

I've searched the forum and I think this post helps. But I don't fully understand the "splitData.rb" code, so I have not implemented it. It seems to be removing the "message" field, but I could be wrong.

Also, I have configured Logstash with 4GB of heap memory. The array in the file has about 500 items for now (it will grow to ~5000). Usually I use Filebeat for all things Elastic (it's less resource intensive), but Filebeat does not support the splitting feature, so this is an exception.

Badger · November 7, 2022, 12:01am

No. That code was only needed because there was an issue in the split filter that caused excessive memory use. That issue has since been fixed.

Have you confirmed that there are files matching the path pattern that have not been read by Logstash?
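
One way to check this, as a minimal sketch rather than a definitive diagnosis: temporarily enable a stdout output next to the elasticsearch output, so every event that reaches the output stage is printed by Logstash. Note that stdout is an output plugin, so it only takes effect inside the output block, not inside input or filter. The assumption here is that comparing the printed events against the files the cron job wrote will show whether a given file was read at all.

```
output {
  # Temporary debugging output: prints each event, including @metadata,
  # so it is visible whether a given file produced any events.
  stdout { codec => rubydebug { metadata => true } }
}
```

The same stdout line can instead be added inside the existing output block. Each printed event should carry the path of the file it came from, which helps match events to the files on disk. Remove the debugging output once the check is done, since it is verbose with a few hundred items per file.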

eleong · November 7, 2022, 2:08am

Hi Badger,

The API calls always output the same JSON pattern. From the picture below, there is a gap, but I'm not sure what happened. I'll monitor this further. Your confirmation of the memory usage fix is already helpful, since I can rule that out of my troubleshooting.

Perhaps I will work on the pipeline a little. Right now, in Elastic, each document's @timestamp is based on the time it was ingested. Maybe I can change it to take the timestamp directly from the API query instead, since there is a timestamp value within the API result.
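
A minimal sketch of that idea, using the date filter. The field name "[result][timestamp]" and the ISO8601 format are assumptions for illustration only, since the actual API response is not shown here; both would need to be adjusted to match the real data.

```
filter {
  # Place this after the split filter, so that each event holds a single
  # "result" item by the time the date filter runs.
  date {
    # "[result][timestamp]" is a hypothetical field name; ISO8601 is an assumed format.
    match => ["[result][timestamp]", "ISO8601"]
    target => "@timestamp"
  }
}
```

If the value cannot be parsed, the date filter leaves @timestamp unchanged and tags the event with _dateparsefailure, so mismatched formats are easy to spot.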

