Some lines sent to Logstash are truncated


(Pierre-Vincent Ledoux) #1

Hi,

I'm parsing a lot of old log files. All the logs are gzipped, so I have to uncompress them and then move them to a folder watched by Filebeat.

Out of about 30 million entries, I have about 1300 failures in the Logstash logs. I'm logging the messages, so I can see that Logstash received a partial line; the line is truncated at a random point.

I double-checked to make sure I don't have any special characters or the like in my logs. So why is Filebeat sending partial lines?


(ruflin) #2

Could you share the following?

  • Filebeat configs
  • Filebeat logs
  • Filebeat version
  • Logstash config
  • An example of a line that was "partial"

(Pierre-Vincent Ledoux) #3

Yes, no problem, but I would prefer to send it by PM if that's OK with you?


(ruflin) #4

You mean the logs? That is ok for me. For the other files it should be possible to post them here (but remove passwords :wink: ).


(Pierre-Vincent Ledoux) #5

Oops, sorry for the delay, I missed your reply.

filebeat.yml

filebeat:
  prospectors:

      - paths:
          - /var/data/level3/beats/*.log
        document_type: level3_log
        exclude_lines: ['^#']
        close_inactive: 10s

  registry_file: /var/data/filebeat_registry

logging.level: info
logging.metrics.enabled: false
logging.to_files: false
logging.to_syslog: false

output:
  logstash:
    hosts: ["logstash:5044"]

logstash.yml

config.reload.automatic: true
config.reload.interval: 5
queue.type: persisted
path.queue: /usr/share/logstash/queue
path.logs: /usr/share/logstash/log
pipeline.workers: 8
pipeline.batch.size: 2500
pipeline.batch.delay: 5
http.host: "0.0.0.0"
xpack.monitoring.enabled: true
xpack.monitoring.elasticsearch.url: http://elasticsearch:9200

The Logstash filters and groks are quite heavy; I'm zipping them up with some log samples right now to send by PM.

Thanks for your help!

Cheers,

Pv


(ruflin) #6

Thanks for the data. Could you provide an example message which was truncated? Also I was looking for the Filebeat logs. Do you see anything special in there?

Is the volume you read logs from a shared drive and somehow mounted or a local disk?

If you write the log output to file instead of LS, do you still see it happening?


(ruflin) #7

BTW: Which filebeat, logstash, logstash-beats-input version are you running?


(Pierre-Vincent Ledoux) #8

I'm now on 5.3 for the whole stack, except Filebeat, which is still on 5.2.2.

I'm running Filebeat on a single node as a Docker container (but this issue already existed when Filebeat was running directly on the host).
The disk is not an SSD but a RAID 5 SATA array.

Filebeat is streaming to 3 Logstash nodes (a container on the same node and 2 remote ones).

I will try to run some tests writing to disk directly today or tomorrow.

I'll send you the failure log by PM.


(ruflin) #9

Quite often such behaviour comes from shared drives, but running inside Docker on Linux should be fine.

Other ideas:

  • How do you remove the files after you index them?
  • Could it be that you have an inode reuse issue? When do you remove old files?
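The inode reuse issue mentioned above can be demonstrated in a few lines. This is a sketch, not Filebeat code: it only shows that on many Linux filesystems (e.g. ext4) a freshly freed inode can be handed to the next file created, which is why an identifier based on inode + device can mistake a new file for an old one.

```python
import os
import tempfile

# Filebeat (5.x) tracks files by inode + device, not by path. If a new
# file lands on a freshly freed inode, Filebeat resumes reading at the
# old file's saved offset -- the visible symptom is a "truncated" line.
d = tempfile.mkdtemp()

old = os.path.join(d, "old.log")
with open(old, "w") as f:
    f.write("old content\n")
ino_old = os.stat(old).st_ino

os.remove(old)  # the inode is now free for reuse

new = os.path.join(d, "new.log")
with open(new, "w") as f:
    f.write("new content\n")
ino_new = os.stat(new).st_ino

# Whether ino_old == ino_new is filesystem-dependent; on ext4 immediate
# reuse is common but not guaranteed, so we only print, not assert it.
print(ino_old, ino_new)
```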

(Pierre-Vincent Ledoux) #10

I have a script that reads the registry. If the offset equals the file size, I delete the file.

I have close_inactive: 10s in the config, and my script is running every minute.
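The script isn't shown in the thread, but the logic described above can be sketched roughly like this, assuming the Filebeat 5.x registry format (a JSON array of entries with `source` and `offset` fields) and the `registry_file` path from the filebeat.yml earlier in the thread:

```python
import json
import os

# Path from the filebeat.yml posted above.
REGISTRY = "/var/data/filebeat_registry"

def finished_files(registry_path):
    """Yield paths whose recorded offset equals the current file size,
    i.e. files Filebeat has read to the end."""
    with open(registry_path) as f:
        entries = json.load(f)
    for entry in entries:
        path = entry["source"]
        try:
            size = os.path.getsize(path)
        except OSError:
            continue  # file already removed
        if entry["offset"] == size:
            yield path

if os.path.exists(REGISTRY):
    for path in finished_files(REGISTRY):
        # Deleting immediately is what opens the door to inode reuse,
        # as discussed below in the thread.
        os.remove(path)
```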

Cheers,

Pv


(ruflin) #11

Could it be that your partial lines actually come from another file because the inode is reused? We had a similar case here: https://github.com/elastic/beats/issues/714#issuecomment-295329605 If that is the case, I recommend first moving the files to another place, so the registry entries get cleaned up, and only removing the files later. This will prevent the inode reuse.


(Pierre-Vincent Ledoux) #12

Hi, sorry for the late answer; I was waiting to be sure the issue was resolved. I'm now moving finished logs to a tmp dir instead of deleting them, and I think that has solved my problem.
Instead of the inode, wouldn't it be possible to use the file path? Or make it configurable for users like me who parse logs that aren't in real time?


(ruflin) #13

Glad that solved the problem.

About using the path as identifier instead of the inode: agreed. This should be a configurable option, or even a separate prospector type, for example `file`, where it is assumed that files are never renamed and data is never appended. Feel free to open a feature request for this on GitHub.


(Pierre-Vincent Ledoux) #14

I will :wink: Thanks a lot!


(Pierre-Vincent Ledoux) #15

Done: https://github.com/elastic/beats/issues/4368


(system) #16

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.