How to tail a log file continuously for new log entries?

Hi,

I am trying to tail a log file so that only new entries are shipped as they arrive, but my current configuration is not working. When I start the agent it reads from the first log entry, and it does not work when I restart it.

I am pushing the log file to a Logstash server, where it is parsed.

Below is my filebeat.yml configuration:

###################### Filebeat Configuration Example #########################

# This file is an example configuration file highlighting only the most common
# options. The filebeat.full.yml file from the same directory contains all the
# supported options with more comments. You can use it as a reference.
#
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/filebeat/index.html

#=========================== Filebeat prospectors =============================

filebeat.prospectors:

# Each - is a prospector. Most options can be set at the prospector level, so
# you can use different prospectors for various configurations.
# Below are the prospector specific configurations.

- input_type: log

  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    - /var/log/*.log
    - /archives/logs/tomcat7-8090/download.log
    - /archives/logs/tomcat7-8080/download.log
    - /etc/filebeat/test/download.log
    - /etc/filebeat/test/test_apache.log
    #- c:\programdata\elasticsearch\logs\*

  # Exclude lines. A list of regular expressions to match. It drops the lines that are
  # matching any regular expression from the list.
  #exclude_lines: ["^DBG"]

  # Include lines. A list of regular expressions to match. It exports the lines that are
  # matching any regular expression from the list.
  #include_lines: ["^ERR", "^WARN"]

  # Exclude files. A list of regular expressions to match. Filebeat drops the files that
  # are matching any regular expression from the list. By default, no files are dropped.
  #exclude_files: [".gz$"]

  # Optional additional fields. These fields can be freely picked
  # to add additional information to the crawled log files for filtering
  #fields:
  #  level: debug
  #  review: 1

  ### Multiline options

  # Multiline can be used for log messages spanning multiple lines. This is common
  # for Java Stack Traces or C-Line Continuation

  # The regexp Pattern that has to be matched. The example pattern matches all lines starting with [
  #multiline.pattern: ^\[

  # Defines if the pattern set under pattern should be negated or not. Default is false.
  #multiline.negate: false

  # Match can be set to "after" or "before". It is used to define if lines should be appended to a pattern
  # that was (not) matched before or after or as long as a pattern is not matched based on negate.
  # Note: After is the equivalent to previous and before is the equivalent to next in Logstash
  #multiline.match: after

#================================ General =====================================

# The name of the shipper that publishes the network data. It can be used to group
# all the transactions sent by a single shipper in the web interface.
#name:

# The tags of the shipper are included in their own field with each
# transaction published.
#tags: ["service-X", "web-tier"]

# Optional fields that you can specify to add additional information to the
# output.
#fields:
#  env: staging

#================================ Outputs =====================================

# Configure what outputs to use when sending the data collected by the beat.
# Multiple outputs may be used.

#-------------------------- Elasticsearch output ------------------------------
#output.elasticsearch:
  # Array of hosts to connect to.
  # hosts: ["localhost:9200"]

  # Optional protocol and basic auth credentials.
  #protocol: "https"
  #username: "elastic"
  #password: "changeme"

#----------------------------- Logstash output --------------------------------
output.logstash:
  # The Logstash hosts
  hosts: ["lvsyslogstash1.lv.jabodo.com:5044"]

  # Optional SSL. By default is off.
  # List of root certificates for HTTPS server verifications
  #ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]

  # Certificate for SSL client authentication
  #ssl.certificate: "/etc/pki/client/cert.pem"

  # Client Certificate Key
  #ssl.key: "/etc/pki/client/cert.key"

#================================ Logging =====================================

# Sets log level. The default log level is info.
# Available log levels are: critical, error, warning, info, debug
#logging.level: debug

# At debug level, you can selectively enable logging only for some components.
# To enable all selectors use ["*"]. Examples of other selectors are "beat",
# "publish", "service".
#logging.selectors: ["*"]

=====================================
And the Logstash config:

input {
  beats {
    port => 5044
  }
}
filter {
  grok {
    match => { "message" => "^(?[[^]]*])%{SPACE}:|:%{SPACE}(?:\s+%{WORD:level})?%{SPACE}:|:%{SPACE}(?:\s+%{USERNAME:hostname})?%{SPACE}:|:%{SPACE}(?:\s+%{GREEDYDATA:coidkey})?%{SPACE}:|:%{SPACE}(?:\s+%{GREEDYDATA:clientinfo})?%{SPACE}:|:%{SPACE}(?:\s+%{IP:clientip})?%{SPACE}:|:%{SPACE}(?:\s+%{GREEDYDATA:Url})?%{SPACE}:|:%{SPACE}(?:\s+%{JAVACLASS:class})?%{SPACE}:|:%{SPACE}%{USER:ident}%{SPACE}(?:\s+%{GREEDYDATA:msg})?$" }
  }
}
output {
  stdout { codec => rubydebug }
}

Thanks,
nikhil

I'm not sure what you are asking.
Essentially, Filebeat tails the files matched by the paths setting. It starts from the beginning of each file, saves its state (the read offset and similar data) in a registry file, and checks the paths for changes every scan_frequency interval.
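
To make that mechanism concrete, here is a minimal sketch of the two settings involved, assuming the Filebeat 5.x option names used in the config above. The path is one of those already listed; the values shown are the defaults as I understand them, so please check them against filebeat.full.yml for your version.

```yaml
filebeat.prospectors:
- input_type: log
  paths:
    - /archives/logs/tomcat7-8080/download.log
  # How often the prospector checks the paths for new or changed files.
  scan_frequency: 10s

# File in which Filebeat persists the read offset of every harvested file.
# On restart, harvesting resumes from the offsets stored here.
filebeat.registry_file: ${path.data}/registry
```

Because the offsets are persisted, a restart normally continues where the previous run stopped instead of re-reading the file.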

I am using Filebeat version 5.4.
When I start Filebeat on the client server to send the specified log file, it starts from the beginning and ships the whole file. I want it to start from the end and push only the new entries written to the files on the specified paths.

If you want to start from the end of the file, you can enable tail_files: true.

You can find the explanation in the *.yml config file:

# Setting tail_files to true means filebeat starts reading new files at the end
# instead of the beginning. If this is used in combination with log rotation
# this can mean that the first entries of a new file are skipped.
#tail_files: false
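
As a sketch of where the option goes, assuming the 5.x prospector layout from the config posted above: tail_files is a prospector-level setting, so it has to be indented under the prospector entry rather than placed at the top level of the file.

```yaml
filebeat.prospectors:
- input_type: log
  paths:
    - /archives/logs/tomcat7-8090/download.log
    - /archives/logs/tomcat7-8080/download.log
  # Start reading at the end of any file that has no saved state yet.
  tail_files: true
```

Note that, per the Filebeat documentation, tail_files only applies to files Filebeat has not already processed. If a file's state is already in the registry, harvesting continues from the saved offset. To apply the option to such files, stop Filebeat and remove the registry file before starting it again (on package installs it is usually under /var/lib/filebeat, but that location is an assumption to verify on your system).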

So far I have been making changes to my filebeat.yml config file. I also see a filebeat.full.yml file (both files are under /etc/filebeat). I added tail_files: true to filebeat.yml.
No luck.

Is it possible that multiline is causing the issue?

Do you see anything in Filebeat's logs?

Hi,

I am seeing the following in the Filebeat logs:
2017-06-01T09:44:02-04:00 ERR Failed to publish events caused by: write tcp 10.140.76.11:48852->10.140.223.89:5044: write: connection reset by peer
2017-06-01T09:44:02-04:00 INFO Error publishing events (retrying): write tcp 10.140.76.11:48852->10.140.223.89:5044: write: connection reset by peer
2017-06-01T09:44:17-04:00 INFO Non-zero metrics in the last 30s: libbeat.logstash.call_count.PublishEvents=2 libbeat.logstash.publish.read_bytes=6 libbeat.logstash.publish.write_bytes=412 libbeat.logstash.publish.write_errors=1 libbeat.logstash.published_and_acked_events=2 libbeat.logstash.published_but_not_acked_events=2 libbeat.publisher.published_events=2 publish.events=2 registrar.states.update=2 registrar.writes=1
2017-06-01T09:44:47-04:00 INFO No non-zero metrics in the last 30s
2017-06-01T09:45:17-04:00 INFO Non-zero metrics in the last 30s: libbeat.logstash.call_count.PublishEvents=2 libbeat.logstash.publish.read_bytes=18 libbeat.logstash.publish.write_bytes=1552 libbeat.logstash.published_and_acked_events=14 libbeat.publisher.published_events=14 publish.events=14 registrar.states.update=14 registrar.writes=2
2017-06-01T09:45:47-04:00 INFO No non-zero metrics in the last 30s
2017-06-01T09:46:17-04:00 INFO Non-zero metrics in the last 30s: libbeat.logstash.call_count.PublishEvents=1 libbeat.logstash.publish.read_bytes=6 libbeat.logstash.publish.write_bytes=417 libbeat.logstash.published_and_acked_events=2 libbeat.publisher.published_events=2 publish.events=2 registrar.states.update=2 registrar.writes=1

I have modified my Logstash config to handle multiline events by treating a line that starts with a timestamp as the beginning of a new entry. It looks like it is working partially.

Logstash config:

input {
  beats {
    port => 5044
    codec => multiline {
      pattern => "^%{TIMESTAMP_ISO8601} "
      negate => true
      what => previous
    }
  }
}
filter {
  grok {
    match => { "message" => "^(?[[^]]*])%{SPACE}:|:%{SPACE}(?:\s+%{WORD:level})?%{SPACE}:|:%{SPACE}(?:\s+%{USERNAME:hostname})?%{SPACE}:|:%{SPACE}(?:\s+%{GREEDYDATA:coidkey})?%{SPACE}:|:%{SPACE}(?:\s+%{GREEDYDATA:clientinfo})?%{SPACE}:|:%{SPACE}(?:\s+%{IP:clientip})?%{SPACE}:|:%{SPACE}(?:\s+%{GREEDYDATA:Url})?%{SPACE}:|:%{SPACE}(?:\s+%{JAVACLASS:class})?%{SPACE}:|:%{SPACE}%{USER:ident}%{SPACE}(?:\s+%{GREEDYDATA:msg})?$" }
  }
}
output {
  stdout { codec => rubydebug }
}
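
An alternative worth considering, offered only as a sketch: the Filebeat documentation recommends joining multiline events in Filebeat before they are shipped, because a multiline codec on the Logstash side can mix lines from different files. The fragment below assumes every new entry begins with a date such as 2017-06-01, which is what the codec pattern above implies. If your entries actually begin with a bracketed timestamp, a pattern such as '^\[' would be the one to try instead.

```yaml
filebeat.prospectors:
- input_type: log
  paths:
    - /archives/logs/tomcat7-8080/download.log
  # Assumed format: each new log entry starts with a yyyy-mm-dd date.
  multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
  # Select the lines that do NOT match the pattern...
  multiline.negate: true
  # ...and append them to the preceding line that did match.
  multiline.match: after
```

With negate: true and match: after, this is the Filebeat counterpart of negate => true and what => previous in the Logstash codec.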

First of all, I think you should raise the log level to debug so that you can see more messages.
At first glance this seems to be a network issue, so it's better to see more details.
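
For reference, a minimal sketch of that change in filebeat.yml, using the two logging options that are already present (commented out) in the config posted earlier in this thread:

```yaml
# Raise the log level from the default (info) to debug.
logging.level: debug
# Enable debug output for all components; narrow the list to reduce noise.
logging.selectors: ["*"]
```

Restart Filebeat after editing the file so the new log level takes effect.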

Note: please use the blockquote feature when you answer, to make your posts easier to read.

Thanks rschirin.

It looks like my grok pattern is not able to identify a new line starting with a timestamp. I will open a new topic in the Logstash category to resolve the grok issue.

Thanks
