Segmentation violation and other errors with Go 1.13.6 - Beats 7.5.1

Hi there,

I am in the process of porting Beats to the s390x architecture and noticed multiple segmentation violations when running the test cases. Most of the Beats (for example Filebeat, Packetbeat, and Heartbeat) hit this error.

This also happens on x86 arch (Linux 2a4cae7a4792 4.15.0-29-generic #31-Ubuntu SMP Tue Jul 17 15:39:52 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux)

/============================================/
...
Testing Heartbeat
go test -i github.com/elastic/beats/heartbeat github.com/elastic/beats/heartbeat/autodiscover github.com/elastic/beats/heartbeat/autodiscover/builder/hints github.com/elastic/beats/heartbeat/beater github.com/elastic/beats/heartbeat/cmd github.com/elastic/beats/heartbeat/config github.com/elastic/beats/heartbeat/eventext github.com/elastic/beats/heartbeat/hbtest github.com/elastic/beats/heartbeat/include github.com/elastic/beats/heartbeat/look github.com/elastic/beats/heartbeat/monitors github.com/elastic/beats/heartbeat/monitors/active/dialchain github.com/elastic/beats/heartbeat/monitors/active/http github.com/elastic/beats/heartbeat/monitors/active/icmp github.com/elastic/beats/heartbeat/monitors/active/tcp github.com/elastic/beats/heartbeat/monitors/defaults github.com/elastic/beats/heartbeat/monitors/jobs github.com/elastic/beats/heartbeat/monitors/wrappers github.com/elastic/beats/heartbeat/reason github.com/elastic/beats/heartbeat/scheduler github.com/elastic/beats/heartbeat/scheduler/schedule github.com/elastic/beats/heartbeat/scheduler/schedule/cron github.com/elastic/beats/heartbeat/scripts/mage github.com/elastic/beats/heartbeat/watcher
go test github.com/elastic/beats/heartbeat github.com/elastic/beats/heartbeat/autodiscover github.com/elastic/beats/heartbeat/autodiscover/builder/hints github.com/elastic/beats/heartbeat/beater github.com/elastic/beats/heartbeat/cmd github.com/elastic/beats/heartbeat/config github.com/elastic/beats/heartbeat/eventext github.com/elastic/beats/heartbeat/hbtest github.com/elastic/beats/heartbeat/include github.com/elastic/beats/heartbeat/look github.com/elastic/beats/heartbeat/monitors github.com/elastic/beats/heartbeat/monitors/active/dialchain github.com/elastic/beats/heartbeat/monitors/active/http github.com/elastic/beats/heartbeat/monitors/active/icmp github.com/elastic/beats/heartbeat/monitors/active/tcp github.com/elastic/beats/heartbeat/monitors/defaults github.com/elastic/beats/heartbeat/monitors/jobs github.com/elastic/beats/heartbeat/monitors/wrappers github.com/elastic/beats/heartbeat/reason github.com/elastic/beats/heartbeat/scheduler github.com/elastic/beats/heartbeat/scheduler/schedule github.com/elastic/beats/heartbeat/scheduler/schedule/cron github.com/elastic/beats/heartbeat/scripts/mage github.com/elastic/beats/heartbeat/watcher
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x202aff6]

...

FAIL: Test JSON response with simple straight-forward comparisons [with body='{"foo": 3}', comparison={'foo': 3}]

Traceback (most recent call last):
File "//src/github.com/elastic/beats/heartbeat/build/python-env/local/lib/python2.7/site-packages/parameterized/parameterized.py", line 392, in standalone_func
return func(*(a + p.args), **p.kwargs)
File "/src/github.com/elastic/beats/heartbeat/tests/system/test_monitor.py", line 165, in test_json_simple_comparisons
proc.check_kill_and_wait()
File "/src/github.com/elastic/beats/heartbeat/tests/system/../../../libbeat/tests/system/beat/beat.py", line 104, in check_kill_and_wait
return self.check_wait(exit_code=exit_code)
File "/src/github.com/elastic/beats/heartbeat/tests/system/../../../libbeat/tests/system/beat/beat.py", line 93, in check_wait
exit_code, actual_exit_code)
AssertionError: Expected exit code to be 0, but it was 2

/============================================/

Upon changing Go version to 1.12.9 everything seems to work pretty well (at least most of the test cases pass).

I wonder if these are known issues.

Thanks for any guidance.

Hmmm, this is a bit odd. We do run our internal builds on 1.13.6, and they pass consistently on Linux.

Out of curiosity, what happens with 1.13.6 when running mage goTestUnit?

Same error:

/=============================/
root@2a4cae7a4792://src/github.com/elastic/beats# mage goTestUnit
2020/01/15 18:12:56 Found Elastic Beats dir at /src/github.com/elastic/beats
Unknown target specified: goTestUnit
root@2a4cae7a4792://src/github.com/elastic/beats# cd heartbeat/
root@2a4cae7a4792://src/github.com/elastic/beats/heartbeat# mage goTestUnit
2020/01/15 18:13:39 Found Elastic Beats dir at /src/github.com/elastic/beats

go test: Unit Testing
FAILURES:
Package: github.com/elastic/beats/heartbeat
Test: Failure
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x202aff6]
goroutine 1 [running]:
github.com/elastic/beats/vendor/github.com/spf13/pflag.(*FlagSet).AddGoFlag(0xc00021c870, 0x0)
github.com/elastic/beats/heartbeat.init.0()


SUMMARY:
Fail: 1
Skip: 0
Pass: 223
Packages: 11
Duration: 34.266046999s
JUnit Report: /src/github.com/elastic/beats/heartbeat/build/TEST-go-unit.xml
Output File: /src/github.com/elastic/beats/heartbeat/build/TEST-go-unit.out

go test: Unit Test Failed
Error: go test failed: 1 test failures

/=============================/

I don't think it matters but just FYI: I am running this inside a Docker container launched using docker run --rm -it ubuntu:18.04 bash. I did the same test in another container using Go 1.12.9 and it worked fine.

Something weird is going on here. The tests for the Beats codebase are run daily on my (Linux x86_64) laptop, on our CI servers, and by tons of developers and others. I never hear about segfaults as an issue.

I suspect there's something about your environment that's off, but it's really hard to say what it is. Have you tried building it in another environment?

Yes - I tried in a couple of environments with similar results. I wonder if it is related to https://github.com/elastic/beats/issues/13910?

Which branch are you on? Have you tried this on master? There may be some issues with your setup on 7.5 I hear.

And to confirm, these issues occur on both x86_64 and s390x, right?

I have been using 7.5.1 branch where this problem occurs.

Master works just fine with Go 1.13.6

The link I mentioned above lists some workarounds that fix the issue (adding testing.Init() in the code, which has made its way to the master branch). I wonder if the 7.5.1 docs can be changed to reflect this behavior, i.e., use Go 1.12.9 with 7.5.1?
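For anyone hitting the same panic, here is a minimal sketch of why testing.Init() helps. It assumes the root cause is the Go 1.13 change that moved registration of the test.* flags out of package initialization and into testing.Init(). The trace above shows AddGoFlag receiving a nil pointer (0x0), which would be consistent with a flag lookup that returned nil inside an init function. The helper names flagRegistered and testFlagRegistered are made up for illustration and are not taken from the Beats code.

```go
package main

import (
	"flag"
	"fmt"
	"testing"
)

// flagRegistered reports whether a flag with the given name is currently
// registered on the default command-line flag set.
func flagRegistered(name string) bool {
	return flag.CommandLine.Lookup(name) != nil
}

// testFlagRegistered is a hypothetical helper: it calls testing.Init first,
// which registers the test.* flags on Go 1.13 and later, and then performs
// the lookup. testing.Init has no effect if it was already called.
func testFlagRegistered(name string) bool {
	testing.Init()
	return flagRegistered(name)
}

func main() {
	// Lookup returns nil for an unregistered flag; passing that nil on to
	// code that dereferences it is what produces a nil pointer panic.
	fmt.Println("test.v registered before testing.Init:", flagRegistered("test.v"))
	fmt.Println("test.v registered after testing.Init: ", testFlagRegistered("test.v"))
}
```

Run as a regular program on Go 1.13 or later, the first lookup should come back empty and the second should find the flag. On Go 1.12 the flags were registered at package initialization, which would explain why 1.12.9 does not hit the panic.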

Yes.