Elasticseach 9.4.0 Won't Start on Nehalem CPU "does not support all the following CPU features"

I got an unexpected surprise today when I tried to upgrade my Elasticsearch cluster from 9.3.4 to 9.4.0 on Ubuntu 24.04 LTS using deb packages. The service won't start.

May 07 08:33:52 elk3 systemd[1]: Starting elasticsearch.service - Elasticsearch...
May 07 08:33:53 elk3 systemd-entrypoint[1016]: The current machine does not support all of the following CPU features that are required by the image: [CX8, CMOV, FXSR, MMX, SSE, SSE2, SSE3, SSSE3, SSE4_1, SSE4_2, POPCNT, LZCNT, AVX, AVX2, BMI1, BMI2, FMA, F16C].
May 07 08:33:53 elk3 systemd-entrypoint[1016]: Please rebuild the executable with an appropriate setting of the -march option.
May 07 08:33:53 elk3 systemd[1]: elasticsearch.service: Main process exited, code=exited, status=1/FAILURE
May 07 08:33:53 elk3 systemd[1]: elasticsearch.service: Failed with result 'exit-code'.
May 07 08:33:53 elk3 systemd[1]: Failed to start elasticsearch.service - Elasticsearch.

Granted, my cluster is built out of several old HP ProLiant DL380p Gen8 servers running Intel(R) Xeon(R) CPU E5-2620 (Nahalem) CPUs. But I read through the release notes for 9.4.0 and didn't see anything about changes to the supported CPU architectures. I wasn't sure if this was a compiling oops, or undocumented change to Elasticsearch's requirements

$ /lib64/ld-linux-x86-64.so.2 --help
Usage: /lib64/ld-linux-x86-64.so.2 [OPTION]... EXECUTABLE-FILE [ARGS-FOR-PROGRAM...]
You have invoked 'ld.so', the program interpreter for dynamically-linked
ELF programs.  Usually, the program interpreter is invoked automatically
when a dynamically-linked executable is started.

You may invoke the program interpreter program directly from the command
line to load and run an ELF executable file; this is like executing that
file itself, but always uses the program interpreter you invoked,
instead of the program interpreter specified in the executable file you
run.  Invoking the program interpreter directly provides access to
additional diagnostics, and changing the dynamic linker behavior without
setting environment variables (which would be inherited by subprocesses).

  --list                list all dependencies and how they are resolved
  --verify              verify that given object really is a dynamically linked
                        object we can handle
  --inhibit-cache       Do not use /etc/ld.so.cache
  --library-path PATH   use given PATH instead of content of the environment
                        variable LD_LIBRARY_PATH
  --glibc-hwcaps-prepend LIST
                        search glibc-hwcaps subdirectories in LIST
  --glibc-hwcaps-mask LIST
                        only search built-in subdirectories if in LIST
  --inhibit-rpath LIST  ignore RUNPATH and RPATH information in object names
                        in LIST
  --audit LIST          use objects named in LIST as auditors
  --preload LIST        preload objects named in LIST
  --argv0 STRING        set argv[0] to STRING before running
  --list-tunables       list all tunables with minimum and maximum values
  --list-diagnostics    list diagnostics information
  --help                display this help and exit
  --version             output version information and exit

This program interpreter self-identifies as: /lib64/ld-linux-x86-64.so.2

Shared library search path:
  (libraries located via /etc/ld.so.cache)
  /lib/x86_64-linux-gnu (system search path)
  /usr/lib/x86_64-linux-gnu (system search path)
  /lib (system search path)
  /usr/lib (system search path)

Subdirectories of glibc-hwcaps directories, in priority order:
  x86-64-v4
  x86-64-v3
  x86-64-v2 (supported, searched)
$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 45
model name      : Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz
stepping        : 7
microcode       : 0x71a
cpu MHz         : 2500.000
cache size      : 15360 KB
physical id     : 0
siblings        : 12
core id         : 0
cpu cores       : 6
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm epb pti ssbd ibrs ibpb stibp tpr_shadow flexpriority ept vpid xsaveopt dtherm ida arat pln pts vnmi md_clear flush_l1d ibpb_exit_to_user
vmx flags       : vnmi preemption_timer invvpid ept_x_only ept_1gb flexpriority tsc_offset vtpr mtf vapic ept vpid unrestricted_guest ple
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit mmio_unknown vmscape
bogomips        : 3990.39
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

For comparison, I spun up a quick Hyper-V VM on a slightly newer Dell R730 server running Intel(R) Xeon(R) CPU E5-2670 v3 (Haswell) CPUs and Elasticsearch 9.4.0 started without issue.

/lib64/ld-linux-x86-64.so.2 --help
Usage: /lib64/ld-linux-x86-64.so.2 [OPTION]... EXECUTABLE-FILE [ARGS-FOR-PROGRAM...]
You have invoked 'ld.so', the program interpreter for dynamically-linked
ELF programs.  Usually, the program interpreter is invoked automatically
when a dynamically-linked executable is started.

You may invoke the program interpreter program directly from the command
line to load and run an ELF executable file; this is like executing that
file itself, but always uses the program interpreter you invoked,
instead of the program interpreter specified in the executable file you
run.  Invoking the program interpreter directly provides access to
additional diagnostics, and changing the dynamic linker behavior without
setting environment variables (which would be inherited by subprocesses).

  --list                list all dependencies and how they are resolved
  --verify              verify that given object really is a dynamically linked
                        object we can handle
  --inhibit-cache       Do not use /etc/ld.so.cache
  --library-path PATH   use given PATH instead of content of the environment
                        variable LD_LIBRARY_PATH
  --glibc-hwcaps-prepend LIST
                        search glibc-hwcaps subdirectories in LIST
  --glibc-hwcaps-mask LIST
                        only search built-in subdirectories if in LIST
  --inhibit-rpath LIST  ignore RUNPATH and RPATH information in object names
                        in LIST
  --audit LIST          use objects named in LIST as auditors
  --preload LIST        preload objects named in LIST
  --argv0 STRING        set argv[0] to STRING before running
  --list-tunables       list all tunables with minimum and maximum values
  --list-diagnostics    list diagnostics information
  --help                display this help and exit
  --version             output version information and exit

This program interpreter self-identifies as: /lib64/ld-linux-x86-64.so.2

Shared library search path:
  (libraries located via /etc/ld.so.cache)
  /lib/x86_64-linux-gnu (system search path)
  /usr/lib/x86_64-linux-gnu (system search path)
  /lib (system search path)
  /usr/lib (system search path)

Subdirectories of glibc-hwcaps directories, in priority order:
  x86-64-v4
  x86-64-v3 (supported, searched)
  x86-64-v2 (supported, searched)

$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 63
model name      : Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz
stepping        : 2
microcode       : 0xffffffff
cpu MHz         : 1200.000
cache size      : 30720 KB
physical id     : 0
siblings        : 8
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 15
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm pti ssbd ibrs ibpb stibp fsgsbase bmi1 avx2 smep bmi2 erms invpcid xsaveopt md_clear flush_l1d arch_capabilities
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit mmio_stale_data bhi its
bogomips        : 4589.37
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

Based on the comparing the flags, the Haswell CPU supports AVX2, BMI1, BMI2, FMA, and F16C while the Nahalem CPU doesn't.

A Github issue was opened May 5th for the same issue. It looks like the cause of the issue has been determined and a pull request exists to fix it. Don't know if there is anything that can be done to get 9.0.4 running, :crossed_fingers:the pull request is merged for the next release.

For info, a workaround is suggested in the pull request page (added minutes ago).

Worth a try for those impacted.

The fix has been merged. Some details from the tracked issue in Github

The fix should go out in the 9.4.1 patch (and is included in the 9.5.0 release): #148542

Until the patched release ships, you can force the launcher to take the JVM fallback path:

sudo systemctl stop elasticsearch
sudo chmod -x /usr/share/elasticsearch/lib/tools/server-launcher/server-launcher
sudo systemctl start elasticsearch