Kubernetes unit tests failing on arm64 and other non-amd64 platforms

I'd like to ask about some test failures and then open an issue if it makes sense.

I'm running the metricbeat unit tests on arm64 and s390x and I'm getting these test failures:

FAIL    github.com/elastic/beats/v7/metricbeat/module/etcd/metrics      0.820s
FAIL    github.com/elastic/beats/v7/metricbeat/module/kubernetes/proxy  0.206s
FAIL    github.com/elastic/beats/v7/metricbeat/module/kubernetes/controllermanager      5.974s
FAIL    github.com/elastic/beats/v7/metricbeat/module/kubernetes/scheduler      0.255s
FAIL    github.com/elastic/beats/v7/metricbeat/module/kubernetes/apiserver      31.262s
FAIL    github.com/elastic/beats/v7/metricbeat/mb/testing/data  68.656s

The failures all follow the same pattern: bucket entries whose value is zero are missing from the output, e.g.:

diff -u bucket-values-expected.json bucket-values-actual.json 
--- bucket-values-expected.json 2022-12-09 12:05:20.273485393 -0500
+++ bucket-values-actual.json   2022-12-09 12:04:58.043485663 -0500
@@ -31,15 +31,12 @@
             "ns": {
                 "bucket": {
                     "+Inf": 3,
-                    "1000000": 0,
                     "1024000000": 3,
                     "128000000": 3,
                     "16000000": 2,
-                    "2000000": 0,
                     "2048000000": 3,
                     "256000000": 3,
                     "32000000": 2,
-                    "4000000": 0,
                     "4096000000": 3,
                     "512000000": 3,
                     "64000000": 3,

I've traced the issue back to the prometheus code that the kubernetes module uses. It casts float64 NaN and Inf to uint64, and the result of that conversion is platform dependent in Go. This happens in a few places, in code similar to:

if bucket.GetCumulativeCount() != uint64(math.NaN()) && bucket.GetCumulativeCount() != uint64(math.Inf(0)) { ...save value...}

On amd64, uint64(math.NaN()) is 0x8000000000000000.
On arm64 and s390x, uint64(math.NaN()) is 0 so buckets with value zero end up getting filtered out.

I haven't received any replies to this post, but I'd like to open a bug on GitHub. I'll wait another day to see if there is any feedback before doing that.
