Can't start filebeat with TLS configured


(Tim Dunphy) #1

Hey guys,

I'm trying to set up my first Filebeat forwarder after having used logstash-forwarder for quite a while.

When I try to start up filebeat I'm getting this error:

```
[root@web1:/etc/filebeat] #systemctl status filebeat.service
● filebeat.service - LSB: Sends log files to Logstash or directly to Elasticsearch.
Loaded: loaded (/etc/rc.d/init.d/filebeat)
Active: failed (Result: exit-code) since Sun 2016-01-31 20:58:29 EST; 6s ago
Docs: man:systemd-sysv-generator(8)
Process: 5579 ExecStart=/etc/rc.d/init.d/filebeat start (code=exited, status=1/FAILURE)

Jan 31 20:58:29 web1 systemd[1]: Starting LSB: Sends log files to Logstash or directly to Elasticsearch....
Jan 31 20:58:29 web1 filebeat[5579]: Starting filebeat: Loading config file error: YAML config parsing failed on /etc/filebeat/filebeat.yml: yaml: line 228: did not find expected key. Exiting.
Jan 31 20:58:29 web1 systemd[1]: filebeat.service: control process exited, code=exited status=1
Jan 31 20:58:29 web1 systemd[1]: Failed to start LSB: Sends log files to Logstash or directly to Elasticsearch..
Jan 31 20:58:29 web1 systemd[1]: Unit filebeat.service entered failed state.
Jan 31 20:58:29 web1 systemd[1]: filebeat.service failed.
```

This only happens if I try to enable the TLS settings in the config. Otherwise it starts fine, but I don't want to ship logs without TLS.

Here's the line that the error is complaining about:

  logstash:
    # The Logstash hosts
    hosts: ["logs.example.com:2541"]

But I think that the problem is up in the TLS section, because if I comment it out I can start it up:

# tls configuration. By default is off.
tls:
# List of root certificates for HTTPS server verifications
certificate_authorities: ["/etc/pki/CA/certs/ca.crt"]

This is my entire config minus the comments:

filebeat:
  prospectors:
    -
      paths:
        - /var/log/*.log
        - /var/log/*/*.log
      input_type: log
  registry_file: /var/lib/filebeat/registry
output:
    tls:
      certificate_authorities: ["/etc/pki/CA/certs/ca.crt"]
  logstash:
    hosts: ["logs.example.com:2541"]
    worker: 1
shipper:
logging:
  files:

Could this be a parsing issue of some kind? How can I get this to work?


(Magnus Bäck) #2

This is indeed a problem with how you format your YAML file, most likely related to indentation. Please edit your post and format the configuration snippets as code so that leading space isn't deleted.


(Steffen Siering) #3

Please format code by enclosing it with 3 backticks (```) to make it readable.
The tls section must be configured under the logstash output; there is no tls setting directly in the output section. It looks like you configured tls for the elasticsearch output by accident.


(Tim Dunphy) #4

Hey guys,

Sorry I didn't know about the proper formatting for that. I just reformatted the post. So, any help or direction I could get on this would be great!!

Thanks


(Magnus Bäck) #5

Do you really not have any indentation in your configuration file? If so, make sure you address that first. YAML is sensitive to indentation. If you do have lines indented, please make another attempt at formatting the file properly here. We really cannot help you otherwise.
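
If it helps, here is a minimal sketch of a check for one common indentation problem. It flags every line whose leading whitespace contains a tab, since YAML does not allow tabs for indentation. The function name `find_tab_indents` is made up for illustration, and the script does not validate YAML in general; it only looks for tabs.

```python
import sys


def find_tab_indents(text):
    """Return (line_number, line) pairs whose leading whitespace contains a tab."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Slice off everything except the leading run of spaces and tabs.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            problems.append((number, line))
    return problems


# Only runs when a config path is passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    path = sys.argv[1]
    with open(path, encoding="utf-8") as handle:
        for number, line in find_tab_indents(handle.read()):
            print(f"{path}:{number}: tab in indentation: {line!r}")
```

Saved under a name of your choice (say `check_tabs.py`, a hypothetical filename), it could be run as `python check_tabs.py /etc/filebeat/filebeat.yml`. No output means no tabs were found in the indentation; the indentation levels themselves can still be wrong.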


(Steffen Siering) #6

I updated your post.

I think I already pointed out the problem in the config file:


(Tim Dunphy) #7

Ok thanks. I've put the TLS info into the correct area now:

Here's my latest attempt:

#egrep -v "^$|(.*)#|^$" filebeat.yml
filebeat:
  prospectors:
    -
      paths:
        - /var/log/*.log
        - /var/log/*/*.log
      input_type: log
  registry_file: /var/lib/filebeat/registry
output:
  logstash:
          hosts: ["logs.example.com:2541"]
          index: filebeat
    tls:
      certificate_authorities: ["/etc/pki/CA/certs/ca.crt"]
      certificate: "/opt/filebeat/web1.example.com.crt"
      certificate_key: "/opt/filebeat/web1.example.com.key"
shipper:
logging:
  files:

And this is the latest error I received:

#systemctl status filebeat.service
● filebeat.service - LSB: Sends log files to Logstash or directly to Elasticsearch.
   Loaded: loaded (/etc/rc.d/init.d/filebeat)
   Active: failed (Result: exit-code) since Mon 2016-02-01 12:47:54 EST; 1min 35s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 17067 ExecStop=/etc/rc.d/init.d/filebeat stop (code=exited, status=0/SUCCESS)
  Process: 19755 ExecStart=/etc/rc.d/init.d/filebeat start (code=exited, status=1/FAILURE)

Feb 01 12:47:54 web1 systemd[1]: Starting LSB: Sends log files to Logstash or directly to Elasticsearch....
Feb 01 12:47:54 web1 filebeat[19755]: Starting filebeat: Loading config file error: YAML config parsing failed on /etc/filebeat/filebeat.yml: yaml.... Exiting.
Feb 01 12:47:54 web1 systemd[1]: filebeat.service: control process exited, code=exited status=1
Feb 01 12:47:54 web1 systemd[1]: Failed to start LSB: Sends log files to Logstash or directly to Elasticsearch..
Feb 01 12:47:54 web1 systemd[1]: Unit filebeat.service entered failed state.
Feb 01 12:47:54 web1 systemd[1]: filebeat.service failed.
Hint: Some lines were ellipsized, use -l to show in full.

If you could help me through this part I'd appreciate it!


(Steffen Siering) #8

You have to use 3 backticks + newline to properly format your source code.

Just by copying the raw message and visualizing whitespace in my editor, I can see that the indentation is off, and there seems to be a mix of spaces and tabs.

Try this:

filebeat:
  prospectors:
    -
      paths:
        - /var/log/*.log
        - /var/log/*/*.log
      input_type: log
  registry_file: /var/lib/filebeat/registry

output:
  logstash:
    hosts:
      - logs.example.com:2541
    index: filebeat
    tls:
      certificate_authorities:
        - /etc/pki/CA/certs/ca.crt
shipper:

logging:
  files:

Unfortunately, client authentication is not yet supported by Logstash, so certificate and certificate_key are not really required. It can't hurt to configure them anyway, so that the config is ready once client authentication becomes available.


(Tim Dunphy) #9

Hey steffens,

Many thanks!! That worked!! Proper indentation did the trick.

```
#systemctl status filebeat
● filebeat.service - LSB: Sends log files to Logstash or directly to Elasticsearch.
   Loaded: loaded (/etc/rc.d/init.d/filebeat)
   Active: active (running) since Wed 2016-02-03 20:01:06 EST; 10s ago
     Docs: man:systemd-sysv-generator(8)
  Process: 17067 ExecStop=/etc/rc.d/init.d/filebeat stop (code=exited, status=0/SUCCESS)
  Process: 31803 ExecStart=/etc/rc.d/init.d/filebeat start (code=exited, status=0/SUCCESS)
   CGroup: /system.slice/filebeat.service
           ├─32008 filebeat-god -r / -n -p /var/run/filebeat.pid -- /usr/bin/filebeat -c /etc/filebeat/fi...
           └─32009 /usr/bin/filebeat -c /etc/filebeat/filebeat.yml

Feb 03 20:00:52 web1 systemd[1]: Starting LSB: Sends log files to Logstash or directly to Elasticsearch....
Feb 03 20:01:06 web1 filebeat[31803]: Starting filebeat: 2016/02/04 01:01:06.006783 transport.go:125:...peer
Feb 03 20:01:06 web1 filebeat[31803]: [  OK  ]
Feb 03 20:01:06 web1 systemd[1]: Started LSB: Sends log files to Logstash or directly to Elasticsearch..
Hint: Some lines were ellipsized, use -l to show in full.
```

(Tim Dunphy) #10

Hey guys,

Ok, now that Filebeat is running on the web server, I can't get any logs to appear for this host when I look in Kibana.

I've verified that filebeat is running:

```
root     32008     1  0 Feb03 ?        00:00:00 filebeat-god -r / -n -p /var/run/filebeat.pid -- /usr/bin/filebeat -c /etc/filebeat/filebeat.yml
root     32009 32008  0 Feb03 ?        00:01:11 /usr/bin/filebeat -c /etc/filebeat/filebeat.yml
```

And it has an established connection to the port that I have running for Logstash:

```
[root@web1:~] #lsof -i :2541
COMMAND    PID USER   FD   TYPE    DEVICE SIZE/OFF NODE NAME
filebeat 32009 root   31u  IPv4 451495940      0t0  TCP web1.example.com:35401->216.xxx.xxx.98:lonworks2 (ESTABLISHED)
```

If I do a search in kibana by:

```host:"web1.example.com"```

Or:

```host:"web1"```

Nothing turns up. If I stop filebeat, and fire up lumberjack (never got around to running logstash-forwarder since lumberjack was working so well) I can see results for this host. 

I tried to do a listing of the directory I'm trying to gather logs for using the same globs that I have in the filebeat configuration:

```
#ls -ltrh /var/log/*.log
-rw-r--r--. 1 root root  15K Dec  5 03:28 /var/log/boot.log
-rw-r--r--. 1 root root  60K Dec  5 03:28 /var/log/cloud-init-output.log
-rw-r--r--. 1 root root  84K Dec  5 03:28 /var/log/cloud-init.log
-rw-------. 1 root root  32K Jan 21 23:02 /var/log/yum.log
-rw-r--r--. 1 root root 1.4M Feb  4 15:25 /var/log/mcollective.log
```

```
[root@web1:~] #ls -ltrh /var/log/*/*.log
-rw-------. 1 root     root        0 Jul  8  2014 /var/log/anaconda/ks-script-9WXnf6.log
-rw-------. 1 root     root        0 Jul  8  2014 /var/log/anaconda/ks-script-8JeWlp.log
-rw-------. 1 root     root        0 Jul  8  2014 /var/log/anaconda/ks-script-5K8vdv.log
-rw-------. 1 root     root     176K Jul  8  2014 /var/log/anaconda/anaconda.storage.log
-rw-------. 1 root     root      33K Jul  8  2014 /var/log/anaconda/anaconda.program.log
-rw-------. 1 root     root     125K Jul  8  2014 /var/log/anaconda/anaconda.packaging.log
-rw-------. 1 root     root      54K Jul  8  2014 /var/log/anaconda/anaconda.log
-rw-------. 1 root     root     6.0K Jul  8  2014 /var/log/anaconda/anaconda.ifcfg.log
-rw-r--r--. 1 bacula   bacula      0 Mar  1  2015 /var/log/bacula/bacula.log
-rw-r--r--. 1 root     root     531M Oct  9 20:57 /var/log/lumberjack/lumberjack.log
-rw-r--r--. 1 root     root        0 Oct 25 12:36 /var/log/logstash-forwarder/logstash-forwarder.log
-rw-r--r--. 1 newrelic root     271K Nov  1 22:57 /var/log/newrelic/newrelic-plugin-agent.log
-rw-rw-r--. 1 zabbix   zabbix      0 Dec  5 04:18 /var/log/zabbix/zabbix_agentd.log
-rw-------. 1 root     root     4.1M Jan  7 10:10 /var/log/audit/audit.log
-rw-r--r--. 1 logstash logstash    0 Jan  7 18:20 /var/log/logstash/logstash.log
-rw-r--r--. 1 root     root      66K Jan 17 02:25 /var/log/tuned/tuned.log
-rw-r--r--. 1 dd-agent dd-agent  532 Jan 17 02:27 /var/log/datadog/jmxfetch.log
-rw-r--r--. 1 dd-agent root      35K Jan 17 02:27 /var/log/datadog/supervisord.log
-rw-rw-rw-. 1 root     root        0 Jan 27 03:18 /var/log/newrelic/newrelic-daemon.log
-rw-r--r--. 1 newrelic newrelic    0 Jan 30 03:47 /var/log/newrelic/nrsysmond.log
-rw-r--r--. 1 root     root     2.0K Feb  1 03:11 /var/log/httpd/jf_php_error.log
-rw-r--r--. 1 root     root        0 Feb  2 03:32 /var/log/newrelic/php_agent.log
-rw-r-----. 1 root     root      30K Feb  4 10:36 /var/log/proftpd/proftpd.sql.log
-rw-r--r--. 1 root     root      14K Feb  4 15:24 /var/log/munin-node/munin-node.log
-rw-r--r--. 1 dd-agent dd-agent 4.5M Feb  4 15:26 /var/log/datadog/collector.log
-rw-r--r--. 1 dd-agent dd-agent 1.3M Feb  4 15:27 /var/log/datadog/dogstatsd.log
-rw-r--r--. 1 dd-agent dd-agent 4.4M Feb  4 15:27 /var/log/datadog/forwarder.log
```

I can see the logs that I'm trying to gather using filebeat. How can I diagnose and solve this problem?

(Magnus Bäck) #11

Look in the Filebeat logs. See filebeat.yml for the logging configuration.


(Tim Dunphy) #12

Hi Magnus,

Ok, so I enabled logging for filebeat. And this is what I'm getting in the log.

I see a bunch of log entries like this:

2016-02-04T23:33:10-05:00 DBG  Update existing file for harvesting: /var/log/proftpd/proftpd.sql.log
2016-02-04T23:33:10-05:00 DBG  Not harvesting, file didn't change: /var/log/proftpd/proftpd.sql.log
2016-02-04T23:33:10-05:00 DBG  Check file for harvesting: /var/log/tuned/tuned.log
2016-02-04T23:33:10-05:00 DBG  Update existing file for harvesting: /var/log/tuned/tuned.log
2016-02-04T23:33:10-05:00 DBG  Not harvesting, file didn't change: /var/log/tuned/tuned.log
2016-02-04T23:33:10-05:00 DBG  Check file for harvesting: /var/log/zabbix/zabbix_agentd.log
2016-02-04T23:33:10-05:00 DBG  Update existing file for harvesting: /var/log/zabbix/zabbix_agentd.log
2016-02-04T23:33:10-05:00 DBG  Not harvesting, file didn't change: /var/log/zabbix/zabbix_agentd.log

And then I see the following:

2016-02-04T23:33:12-05:00 DBG  Try to publish %!s(int=200) events to logstash with window size %!s(int=10)
2016-02-04T23:33:12-05:00 DBG  %!s(int=0) events out of %!s(int=200) events sent to logstash. Continue sending ...
2016-02-04T23:33:12-05:00 INFO Error publishing events (retrying): EOF
2016-02-04T23:33:12-05:00 DBG  Try to publish %!s(int=200) events to logstash with window size %!s(int=10)
2016-02-04T23:33:12-05:00 DBG  %!s(int=0) events out of %!s(int=200) events sent to logstash. Continue sending ...
2016-02-04T23:33:12-05:00 INFO Error publishing events (retrying): EOF
2016-02-04T23:33:12-05:00 INFO send fail
2016-02-04T23:33:12-05:00 INFO backoff retry: 1s
2016-02-04T23:33:13-05:00 DBG  Try to publish %!s(int=200) events to logstash with window size %!s(int=10)
2016-02-04T23:33:13-05:00 DBG  %!s(int=0) events out of %!s(int=200) events sent to logstash. Continue sending ...
2016-02-04T23:33:13-05:00 INFO Error publishing events (retrying): EOF
2016-02-04T23:33:13-05:00 INFO send fail

I'm thinking these lines are probably important:

2016-02-04T23:37:29-05:00 DBG  Try to publish %!s(int=200) events to logstash with window size %!s(int=10)
2016-02-04T23:37:29-05:00 DBG  %!s(int=0) events out of %!s(int=200) events sent to logstash. Continue sending ...
2016-02-04T23:37:29-05:00 INFO Error publishing events (retrying): EOF

Now that we know all this, is there anything else I should look for in the logs? How can I fix the problem of no logs getting through to Logstash?

Thanks


(Magnus Bäck) #13

Yes, those are indeed the interesting lines from the log. Unfortunately I don't know what they mean. What version of the beats plugin are you running on the Logstash side? And what version of Filebeat?
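
With debug logging enabled, the harvester messages tend to drown out everything else. As a rough sketch, the snippet below tallies the non-DBG lines so that repeated errors stand out. It assumes the `timestamp LEVEL message` layout seen in the excerpt above; the function name `summarize_non_debug` is hypothetical, and only the DBG and INFO levels are confirmed by this thread.

```python
import re
from collections import Counter

# Assumed layout, based on lines such as:
# 2016-02-04T23:33:12-05:00 INFO Error publishing events (retrying): EOF
LINE_RE = re.compile(r"^(\S+)\s+([A-Z]+)\s+(.*)$")


def summarize_non_debug(log_text):
    """Count (level, message) pairs for every line that is not at DBG level."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group(2) != "DBG":
            counts[(match.group(2), match.group(3))] += 1
    return counts
```

Feeding it the contents of the Filebeat log and printing `counts.most_common()` should group the repeated "Error publishing events (retrying): EOF" lines into a single entry with a count.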


(Tim Dunphy) #14

I'm running version 1.0.1 of Filebeat on the web server:

filebeat version 1.0.1 (amd64)

And haven't started running filebeats on the Logstash server. I wanted to try to get it running on one server first before I started running it anywhere else. I figured that it would be like a drop-in replacement for lumberjack/logstash-forwarder.

I'm not sure what those lines mean either. Maybe you could ask around?


(Magnus Bäck) #15

> And haven't started running filebeats on the Logstash server. I wanted to try to get it running on one server first before I started running it anywhere else. I figured that it would be like a drop-in replacement for lumberjack/logstash-forwarder.

Wait, what? You're still running the lumberjack input plugin in Logstash rather than the beats plugin? If so that's the problem. Filebeat will not work with the lumberjack input. I believe the protocols are very similar but they're not compatible.

> Maybe you could ask around?

Hopefully one of the Filebeat folks is reading this thread.


(Tim Dunphy) #16

Oh uhhhh.... yeah! I didn't realize they required their own plugin. I read that they were replacing logstash-forwarder, so I thought they used the same input. Now I get you! I'll add the beats input and see where that takes me. Makes total sense at this point! Thanks!


(Tim Dunphy) #17

Magnus,

Thanks for the clue-in! Adding the beats plugin on the Logstash side worked, of course. Log messages are flowing in now. As an example, I'm seeing this in the Filebeat log file at this point:

{
  "@timestamp": "2016-02-05T07:10:54.000Z",
  "beat": {
    "hostname": "web1",
    "name": "filebeat"
  },
  "count": 1,
  "fields": null,
  "input_type": "log",
  "message": "2016-02-05 01:48:17 EST | ERROR | dd.forwarder | forwarder(ddagent.py:262) | Response: HTTPResponse(_body=None,buffer=\u003c_io.BytesIO object at 0x7f3596ce9170\u003e,code=403,effective_url='https://5-6-3-app.agent.datadoghq.com/intake/?api_key=your_API_key',error=HTTPError('HTTP 403: Forbidden',),headers={'Content-Length': '15', 'X-Content-Type-Options': 'nosniff', 'Strict-Transport-Security': 'max-age=15724800;', 'Dd-Pool': 'propjoe', 'Connection': 'keep-alive', 'Date': 'Fri, 05 Feb 2016 06:48:17 GMT', 'Content-Type': 'text/plain'},reason='Forbidden',request=\u003ctornado.httpclient.HTTPRequest object at 0x7f3595bdaf10\u003e,request_time=0.05147695541381836,time_info={})",
  "offset": 483074,
  "source": "/var/log/datadog/forwarder.log",
  "tags": [
    "jokefire-dev",
    "web-tier"
  ],
  "type": "log"
}

Thanks!
Tim


(Steffen Siering) #18

Right, the beats plugin is derived from lumberjack, but Beats are not compatible with the lumberjack plugin.


(Tim Dunphy) #19

Yup! Got it. Thanks!

