Previously stable Packer builds for Kibana now silently failing to run the service, returning only 503s

I've got a simple setup with Jenkins using Packer to pull down, install, and configure Kibana. The base code I used can be found here: https://github.com/synhershko/elasticsearch-cloud-deploy/

This has actually been running happily for a few weeks now.

My most recent builds started failing to respond on startup. Investigation shows that the apt-get install kibana command completes successfully, but sudo service kibana start runs with zero output and produces no logs; /var/log/kibana does not even exist.

I've tried every way I can find to start the service and I'm getting silence back.

Any ideas on how I can debug this?
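For context, the Kibana part of the provisioning script boils down to roughly this (simplified; the apt repo setup and config templating are omitted):

sudo apt-get update
sudo apt-get install -y kibana   # completes without errors
sudo service kibana start        # exits immediately with no output, and /var/log/kibana never appears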

When I'm debugging stuff like this I usually start with service kibana status. Can you share the output of that command?

If that doesn't help, I'll move on to journalctl -u kibana.service. If that still shows nothing, I'll inspect the service file to see how it starts the service and then try to do the same thing myself. You should be able to find the location of the service file in the output from service kibana status, and from there figure out which command it runs and from which directory.
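Roughly the sequence I'd run, assuming the deb package installed the usual kibana.service unit:

service kibana status                             # loaded/active state, main PID, path to the unit file
journalctl -u kibana.service --no-pager -n 200    # recent output captured by systemd, even when /var/log/kibana doesn't exist
systemctl cat kibana.service                      # the unit file itself: ExecStart, User, EnvironmentFile, etc.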

Thanks for that command; it would have saved me some time earlier when I was going through the services by hand.

In short: it's active, and yet not responsive.

service kibana status
● kibana.service - Kibana
   Loaded: loaded (/etc/systemd/system/kibana.service; enabled; vendor preset: enabled)
   Active: active (running) since Sat 2018-12-22 01:10:07 UTC; 12min ago
 Main PID: 10303 (node)
    Tasks: 10
   Memory: 1.4G
      CPU: 15min 16.761s
   CGroup: /system.slice/kibana.service
           └─10303 /usr/share/kibana/bin/../node/bin/node --no-warnings /usr/share/kibana/bin/../src/cli -c /etc/kibana/kibana.yml
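For what it's worth, hitting Kibana's status endpoint locally still just comes back with a 503 (default port 5601; adjust if yours differs):

curl -is http://localhost:5601/api/status | head -n1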

Also, I really appreciate the help!

I just noticed I missed some of the (very interesting) output.

Dec 22 01:10:07 ip-172-20-131-143 systemd[1]: Started Kibana.
Dec 22 01:10:23 ip-172-20-131-143 kibana[10303]: {"type":"log","@timestamp":"2018-12-22T01:10:23Z","tags":["info","optimize"],"pid":10303,"message":"Optimizing and caching bundles for ml, stateSessionStorageRedirect, status_page, timelion, graph, monitoring, space_selector, dashboardViewer, apm, canvas, infra and kibana. This may take a few minutes"}

Ah, yes, that is going to take a while. The optimize step shouldn't be necessary unless you have disabled or installed plugins as part of your config, but it can take several minutes to run. On the plus side, it indicates that Kibana is actually running and writing logs!
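If you want to keep an eye on it, following the journal is enough; once the optimize step finishes, Kibana should log that the server is listening:

journalctl -u kibana.service -f    # wait for a 'listening' / 'Server running at ...' style message after the optimize step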

Yeah, that seems like good news, but... the bad news is that this machine has been up for a while now and shows no sign of being healthy. It's been running for at least 4 hours.

Well hey, look at that: it took a while, but that log has much more information now:

I'm guessing my instance sizing is wrong?

kibana.service: Main process exited, code=dumped, status=6/ABRT
kibana.service: Unit entered failed state.
kibana.service: Failed with result 'core-dump'.
kibana.service: Service hold-off time over, scheduling restart.
Stopped Kibana.
Started Kibana.
{"type":"log","@timestamp":"2018-12-21T00:49:56Z","tags":["info","optimize"],"pid":3993,"message":"Optimizing and caching bundles for ml, stateSessionStorageRedirect, status_page,
<--- Last few GCs --->
[3993:0x2745100]  1178550 ms: Mark-sweep 1310.7 (1437.6) -> 1310.7 (1437.6) MB, 1815.8 / 0.0 ms  allocation failure GC in old space requested
[3993:0x2745100]  1180384 ms: Mark-sweep 1310.7 (1437.6) -> 1310.7 (1420.1) MB, 1834.1 / 0.0 ms  last resort GC in old space requested
[3993:0x2745100]  1182209 ms: Mark-sweep 1310.7 (1420.1) -> 1310.7 (1420.1) MB, 1824.3 / 0.0 ms  last resort GC in old space requested
<--- JS stacktrace --->
==== JS stack trace =========================================
Security context: 0x2bf41cea58b9 <JSObject>
    0: builtin exit frame: lastIndexOf(this=0x35043c7fc249 <Very long string[5033758]>,0x20dc382920b9 <String[1]\: \n>)
    1: has_nlb(aka has_nlb) [0x363d37f022d1 <undefined>:5970] [bytecode=0x2bbfa86749b9 offset=15](this=0x363d37f022d1 <undefined>)
    2: /* anonymous */(aka /* anonymous */) [0x363d37f022d1 <undefined>:6070] [bytecode=0x17ada5067c9 offset=60](this=0x363d37f022d1 <undefined>,c...
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
 1: node::Abort() [/usr/share/kibana/bin/../node/bin/node]
 2: 0x8cce9c [/usr/share/kibana/bin/../node/bin/node]
 3: v8::Utils::ReportOOMFailure(char const*, bool) [/usr/share/kibana/bin/../node/bin/node]
 4: v8::internal::V8::FatalProcessOutOfMemory(char const*, bool) [/usr/share/kibana/bin/../node/bin/node]
 5: v8::internal::Factory::NewRawTwoByteString(int, v8::internal::PretenureFlag) [/usr/share/kibana/bin/../node/bin/node]
 6: v8::internal::String::SlowFlatten(v8::internal::Handle<v8::internal::ConsString>, v8::internal::PretenureFlag) [/usr/share/kibana/bin/../node/bin/node]
 7: v8::internal::String::Flatten(v8::internal::Handle<v8::internal::String>, v8::internal::PretenureFlag) [/usr/share/kibana/bin/../node/bin/node]
 8: v8::internal::String::LastIndexOf(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::inter
 9: v8::internal::Builtin_StringPrototypeLastIndexOf(int, v8::internal::Object**, v8::internal::Isolate*) [/usr/share/kibana/bin/../node/bin/node]

Oh, found something: https://github.com/elastic/kibana/issues/15683

Looks like the setup is hitting a memory allocation ceiling and dying? We'll see if that flag works: NODE_OPTIONS="--max-old-space-size=4096"
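Before baking that into the image, a quick sanity check is to run Kibana by hand with the larger heap and see whether the optimize step survives (paths taken from the service output above; the bin/kibana wrapper should pass NODE_OPTIONS through to node):

sudo -u kibana env NODE_OPTIONS="--max-old-space-size=4096" /usr/share/kibana/bin/kibana -c /etc/kibana/kibana.yml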

Okay, it took me a lot longer than I'd like to admit, but I've finally tracked this down. Here's what I had to add to my startup script:


sudo echo "MAX_LOCKED_MEMORY=unlimited" >> /etc/default/kibana
sudo echo "NODE_OPTIONS=\"--max-old-space-size=4096\"" >> /etc/default/kibana
