Kernel Update on Ubuntu 17.10 causes Elasticsearch to panic

Hi,

Is anybody else seeing this? After the most recent kernel update on Ubuntu 17.10, from 4.13.0-25-generic to 4.13.0-31-generic, we see kernel oopses when running Elasticsearch 5.3.3. Is there any remedy other than downgrading the kernel?

Thanks... Dominik.

ProblemType: KernelOops
Annotation: Your system might become unstable now and might need to be restarted.
Date: Wed Jan 24 08:00:40 2018
Failure: oops
OopsText:
BUG: unable to handle kernel paging request at 00007facb845b240
IP: 0x7facb845b240
PGD 8000000faed9a067
P4D 8000000faed9a067
PUD 1027e5d067
PMD f58bd0067
PTE f9f146025

Oops: 0011 [#4] SMP PTI
Modules linked in: rfcomm vmnet(OE) vmw_vsock_vmci_transport vsock vmw_vmci vmmon(OE) cmac bnep xt_conntrack ip6table_filter ip6table_mangle ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 nf_nat nf_conntrack libcrc32c ip6_tables nls_iso8859_1 intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel dell_wmi kvm sparse_keymap wmi_bmof mxm_wmi irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_hdmi pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd intel_cstate arc4 dell_smm_hwmon dell_laptop dell_smbios dcdbas snd_hda_codec_realtek iwlmvm snd_usb_audio snd_hda_codec_generic mac80211 snd_usbmidi_lib snd_seq_midi snd_hda_intel intel_rapl_perf snd_seq_midi_event snd_rawmidi snd_hda_codec snd_hda_core snd_hwdep iwlwifi joydev snd_pcm serio_raw
snd_seq cfg80211 rtsx_pci_ms uvcvideo memstick snd_seq_device snd_timer videobuf2_vmalloc videobuf2_memops uas videobuf2_v4l2 input_leds videobuf2_core usb_storage snd videodev btusb mei_me nvidia_uvm(POE) hci_uart btrtl media soundcore mei btbcm serdev btqca btintel bluetooth int3403_thermal shpchp acpi_als intel_pch_thermal processor_thermal_device ecdh_generic kfifo_buf intel_soc_dts_iosf int3402_thermal int3400_thermal industrialio dell_smo8800 intel_lpss_acpi wmi int340x_thermal_zone dell_rbtn mac_hid intel_lpss acpi_thermal_rel acpi_pad parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) drm_kms_helper e1000e syscopyarea sysfillrect sysimgblt ptp fb_sys_fops rtsx_pci_sdmmc psmouse drm pps_core ahci rtsx_pci libahci
pinctrl_sunrisepoint i2c_hid video pinctrl_intel hid
CPU: 7 PID: 6033 Comm: java Tainted: P D OE 4.13.0-31-generic #34-Ubuntu
Hardware name: Dell Inc. Precision 7510/0M1YNP, BIOS 1.14.4 07/28/2017
task: ffff9b66ef05df00 task.stack: ffffbe428a47c000
RIP: 0010:0x7facb845b240
RSP: 0018:ffffbe428a47ff50 EFLAGS: 00010202
RAX: 00000000000003e7 RBX: 0000000000000000 RCX: 00007facf0ddaa49
RDX: 00007facf0def996 RSI: 0000000006987877 RDI: 0000000000000000
RBP: 0000000000000000 R08: 00007face8697730 R09: 000000000000000c
R10: 00007facb845b240 R11: ffff9b66ef05df00 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS: 00007facf18e4700(0000) GS:ffff9b683ddc0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007facb845b240 CR3: 0000000f5c106006 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
? entry_SYSCALL_64_fastpath+0x33/0xa3
Code: Bad RIP value.
RIP: 0x7facb845b240 RSP: ffffbe428a47ff50
CR2: 00007facb845b240
---[ end trace 9c824342d17ab268 ]---

Package: linux-image-4.13.0-31-generic 4.13.0-31.34
SourcePackage: linux
Tags: kernel-oops
Uname: Linux 4.13.0-31-generic x86_64

Same as reported here:

As reported in this issue:

After booting the previous kernel, 4.13.0-25-generic, ES works fine again!

Hi Dominik,

If you do not want to downgrade the kernel you will need to upgrade your Elasticsearch installation.
As I understand it, releases from version 5.6.4 onwards should not be affected.
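
If you need to sort a list of installations by that threshold, here is a minimal Python sketch. The 5.6.4 cutoff is only what is reported in this thread and has not been verified here; the helper names `parse_version` and `is_reported_unaffected` are made up for illustration.

```python
import re

# Cutoff as reported in this thread; treat it as an assumption, not a verified fact.
FIRST_REPORTED_UNAFFECTED = (5, 6, 4)


def parse_version(text):
    """Turn '5.3.3' into (5, 3, 3); a suffix such as '-SNAPSHOT' is ignored."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"not a version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_reported_unaffected(es_version):
    """True if the version is at or above the cutoff mentioned in the thread."""
    return parse_version(es_version) >= FIRST_REPORTED_UNAFFECTED


if __name__ == "__main__":
    for version in ("5.3.3", "5.6.4", "6.1.2"):
        print(version, is_reported_unaffected(version))
```

Comparing tuples of integers avoids the string-comparison trap where "5.10.0" would sort before "5.6.4".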

Best,

Janko

Thanks for clarifying, @Janko. I missed that.

Thanks for the links and suggestions. For us, downgrading the kernel on a few affected developer machines is probably easier than upgrading all the Elasticsearch installations everywhere.
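
To find out which developer machines are on the kernel build from the oops report, something like the following Python sketch could be run on each one. It only knows the two builds named in this thread (4.13.0-31-generic as reported bad, 4.13.0-25-generic as reported working); every other build is classified as unknown. The function name `classify_kernel` is made up for illustration.

```python
import platform

# Kernel builds named in this thread; nothing is assumed about any other build.
REPORTED_BAD = {"4.13.0-31-generic"}
REPORTED_GOOD = {"4.13.0-25-generic"}


def classify_kernel(release):
    """Classify a kernel release string (as printed by `uname -r`)."""
    if release in REPORTED_BAD:
        return "reported-bad"
    if release in REPORTED_GOOD:
        return "reported-good"
    return "unknown"


if __name__ == "__main__":
    # platform.release() returns the running kernel release on Linux.
    release = platform.release()
    print(release, classify_kernel(release))
```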

Is there a list of affected Ubuntu versions? That is, is it only 17.10, or do you know of other affected releases as well?

We tested 16.04 and it does not seem to be affected.

This is a kernel bug. Upgrading ES might work around the issue (I have not verified this), but it is the wrong solution. My point: the kernel has a bad bug and can’t be trusted, so you should drop it immediately.
