Repackaging beats .deb files - service won't start

Hi all,

I've extracted and repackaged the official .deb files for filebeat, packetbeat, and topbeat, so I can include my own configs and certs and deploy via puppet. I've also added my own postinst and prerm files to update rc.d and start the service via the included init.d script.

All is well, except that the service won't start when the deb is installed via apt-get. If I install the same .deb files using dpkg -i <debname> it starts up fine.

Using topbeat as an example, I've added some debug and the init.d script is getting as far as the start-stop-daemon attempting to run topbeat-god, but nothing appears to happen at that point.

I've used start-stop-daemon -v, but all I get is output telling me it's starting topbeat-god. Similarly, adding -l /tmp/topbeat-god to topbeat-god just creates an empty file, but the service isn't running when I check.

For completeness, my postinst script is:

#!/bin/bash
echo "Updating rc.d"
/usr/sbin/update-rc.d topbeat defaults > /dev/null

if pgrep "topbeat" > /dev/null
then
    echo "Restarting topbeat"
    /etc/init.d/topbeat restart
else
    echo "Starting topbeat"
    /etc/init.d/topbeat start
fi

It's definitely hitting each step as it's echoing the output, and the rc.d updates are taking place.

Running the init.d script after installing with apt-get starts the service as expected. It just won't start on install.

I suspect there's some sort of environment issue running topbeat-god from apt-get, but I don't know where to look for clues. Can I get some more verbose output, or does someone know what the problem may be?

System is Ubuntu 14.04.3 64bit.
Beats are all v. 1.0.1

Thanks!

As far as I understand, if you go with our provided package, it works as expected? The best place to look for some more clues is in the beats-packer repo https://github.com/elastic/beats-packer as this is where we generate the packages.

@ruflin I don't believe the provided package attempts to start the service on install, although I could be wrong. There's no postinst script that I can see when it's unpacked, so it doesn't appear to be trying to start it.

Everything looks OK in my repacked version, aside from the service not starting automatically on install, even though I've added a postinst that should do that. I'm using the packaged init.d script, so I can't think of a reason why it wouldn't start other than something in the environment being different when launched via apt-get.

Some more info: If I add an strace -f to the init.d script, package it back up (with my postinst script), and install via apt-get, the service starts up, however the package ends up in a half-installed state because strace never exits.

e.g.,

( ... )
do_start()
{
    strace -f start-stop-daemon --start \
        --pidfile $PIDFILE \
        --exec $WRAPPER -- $WRAPPER_ARGS -- $DAEMON $DAEMON_ARGS
}
( ... )

Strace itself gives no clues because the process is actually running at this point.

Without the strace, but with -d "*" added to $DAEMON_ARGS in the init script, the only output I get is from geolite.go telling me GeoIP is disabled because no paths are set (which I would expect). There is no other output and no sign of the process in ps after that.

Does that give any clues as to why dpkg starts the service but apt doesn't?

I'm really stuck here ... :confused:

The GeoIP line should not be relevant. What do you have in your config under logging? Are you logging to a file and have the level on debug? If yes, you should have some more information inside.

I would add some additional debug to your postinstall script to dump out the environment to see what the differences are between apt-get and dpkg installation. I can't think of anything that would prevent topbeat-god from launching when installed via apt-get. For reference you can see the source for topbeat-god at https://github.com/fiorix/go-daemon.
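To make that comparison concrete, here is a minimal sketch of an environment dump that could be called from the postinst just before the service is started. The function name dump_env, the ENV_DUMP_DIR variable, and the output file names are all hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: snapshot the environment a maintainer script runs in.
# ENV_DUMP_DIR and the file naming below are assumptions, not package defaults.
ENV_DUMP_DIR="${ENV_DUMP_DIR:-/tmp}"

dump_env() {
    # $1 is a free-form label such as "apt" or "dpkg"
    out="$ENV_DUMP_DIR/topbeat-postinst-env.$1"
    # Sort the variables so the two dumps line up when diffed
    env | sort > "$out"
    # Print the path so the caller knows where the dump went
    echo "$out"
}

# Example use inside the postinst (uncomment and pick a label per install method):
# dump_env apt
```

After installing once with apt-get and once with dpkg -i, using a different label each time, running diff on the two files shows which variables differ between the two paths.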

If you are using Puppet, why not install the package using the Puppet package type and then install and start the service using the Puppet service type.

Thanks for getting back to me.

@ruflin Yep, I just included the GeoIP bit as it's the only logging I get on startup.

I had debug logging set to go to syslog. Here's what I get:

apt:

Jan 7 10:57:14 civet /usr/bin/topbeat[4981]: geolite.go:24: GeoIP disabled: No paths were set under output.geoip.paths

dpkg:

Jan 7 10:57:36 civet /usr/bin/topbeat[5066]: geolite.go:24: GeoIP disabled: No paths were set under output.geoip.paths
Jan 7 10:57:36 civet /usr/bin/topbeat[5066]: outputs.go:111: Activated logstash as output plugin.
Jan 7 10:57:36 civet /usr/bin/topbeat[5066]: publish.go:249: Publisher name: civet
Jan 7 10:57:36 civet /usr/bin/topbeat[5066]: beat.go:107: Init Beat: topbeat; Version: 1.0.1
Jan 7 10:57:36 civet /usr/bin/topbeat[5066]: beat.go:133: topbeat sucessfully setup. Start running.

@andrewkroh I added env just before start-stop-daemon and diffed the output. The only change is that apt has set DPKG_NO_TSTP=yes which doesn't look relevant from what I can tell.

Regarding puppet, most, but not all hosts here use puppet, so relying on that to start the service doesn't completely solve the problem. We have just under 2000 hosts in our host DB, and multiple puppet masters. In some cases I would be getting others to install the package for me and would really prefer that it Just Works for them.

Maybe something is being logged by topbeat but swallowed by topbeat-god. Topbeat-god is a black hole for the program's stdout and stderr unless you add -l FILE to its arguments.

Try adding -l /var/log/topbeat-god to the WRAPPER_ARGS to see if anything is being logged when it shuts down. While you are at it, I would add -e -v -d "*" to DAEMON_ARGS so that all debug is logged to /var/log/topbeat-god.

$ topbeat-god -h
Use: god [options] [--] program [arguments]
Options:
-h --help           show this help and exit
-v --version        show version and exit
-f --foreground     run in foreground
-n --nohup          make the program immune to SIGHUP
-l --logfile FILE   write the program's stdout and stderr to FILE
-p --pidfile FILE   write pid to FILE
-r --rundir DIR     switch to DIR before executing the program
-u --user USER      switch to USER before executing the program
-g --group GROUP    switch to GROUP before executing the program

The program's output go to a blackhole if no logfile is set.
Log files are recycled on SIGHUP.

@andrewkroh Same again I'm afraid - just the same GeoIP output in the topbeat-god log.

2016/01/06 23:57:36.192785 geolite.go:24: INFO GeoIP disabled: No paths were set under output.geoip.paths

Are you able to reproduce this at all? Just adding my postinst at the top of this page to the package should do it.

I can reproduce the issue, but I am not sure what is causing it. My log contained a bit more than yours; topbeat got to the point where it said "topbeat successfully setup. Start running." But that is probably just due to timing (the time between the first log line and the last was only 3ms).

Okay, good to know it's not just me.

Is this likely to be something you guys keep investigating? I'm all out of ideas here.

What if you modify the start-stop-daemon command to not rely on topbeat-god, like this:

        start-stop-daemon -vvv --start \
                --pidfile $PIDFILE  \
                --background --exec $DAEMON -- $DAEMON_ARGS \
                || return 2

Hi @andrewkroh

I'm getting more output, but still no running process at the end.

2016/01/08 02:35:24.898609 geolite.go:24: INFO GeoIP disabled: No paths were set under output.geoip.paths
2016/01/08 02:35:25.007755 outputs.go:111: INFO Activated logstash as output plugin.
2016/01/08 02:35:25.007903 publish.go:249: INFO Publisher name: civet
2016/01/08 02:35:25.008318 beat.go:107: INFO Init Beat: topbeat; Version: 1.0.1
Starting /usr/bin/topbeat...
Detaching to start /usr/bin/topbeat...done.

Using the changes I suggested I do end up with a running process after performing an apt-get install topbeat.

@andrewkroh I've gone back and built a clean package with your suggested change, and can now get the service to start via apt.

The only issue is now that it takes 30 seconds to stop or restart the service with the init script, when previously it was instant. Do you have any ideas about that?

# time /etc/init.d/topbeat stop

real 0m30.029s
user 0m0.401s
sys 0m1.334s

Thanks for all your help so far.

Since you are modifying the start command, the stop and status commands will need to be updated as well.

  • Add -m to the start-stop-daemon command in do_start so that the PID file gets created since topbeat won't do this. Previously topbeat-god did that.
  • Change WRAPPER to DAEMON in the stop and status commands.
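Put together, the two changes might look roughly like the sketch below. The paths and DAEMON_ARGS value are assumptions, so keep whatever the packaged init script already defines. SSD is a hypothetical variable added only so the functions can be dry-run with a stub, and any other flags the packaged do_stop passes are omitted here.

```shell
#!/bin/sh
# Sketch of the init.d functions once topbeat-god is out of the picture.
# These values are assumptions; reuse the ones from the packaged script.
DAEMON="/usr/bin/topbeat"
DAEMON_ARGS="-c /etc/topbeat/topbeat.yml"
PIDFILE="/var/run/topbeat.pid"
# Hypothetical override so the functions can be exercised without the real tool
SSD="${SSD:-start-stop-daemon}"

do_start() {
    # -m (--make-pidfile) writes the PID file that topbeat-god used to create;
    # --background detaches topbeat, since it does not daemonize itself.
    $SSD --start --pidfile "$PIDFILE" -m \
        --background --exec "$DAEMON" -- $DAEMON_ARGS \
        || return 2
}

do_stop() {
    # Match on DAEMON rather than WRAPPER so the running process is found
    $SSD --stop --quiet --pidfile "$PIDFILE" --exec "$DAEMON"
}
```

Setting SSD=echo before calling do_start prints the command line that would be run, which is a cheap way to check the flags before rebuilding the package.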

@andrewkroh that's it - all working.

Thank you so much for sticking with this and getting me a solution.

I'm just wondering if the issue with no startup when using topbeat-god will be something you guys follow up on for a future release? I'd really prefer not to deviate from upstream, as my original intent was to just add a couple of config and startup files, and repackage.

FWIW, if anyone else is following along, also change WRAPPER to DAEMON in the init.d status command.

Thanks again.

I don't think we will be changing the init scripts because of this issue. But we did prove that we can daemonize a Go process without go-daemon by using standard Debian tools, so it is an option.

With most distros moving toward the usage of systemd we won't need to use go-daemon at all on those operating systems. We are going to start including a unit file for systemd in our packages.

I don't think start-stop-daemon exists on e.g. CentOS (pre systemd).

I'm filing an issue on GitHub because this is clearly a bug in the init scripts for Debian.

It would be wiser to fix this issue upstream (in your repos). Currently, many SaltStack community formulas can't work out of the box without these changes to the init script.