Fatal error: runtime: cannot map pages in arena address space

Hi all,

I am getting the following error on windows servers, when the filebeat starts.

fatal error: runtime: cannot map pages in arena address space

Filebeat is 32bit
Any idea, how to solve ?

Thanks,

Ori

Which Windows release?

Windows 2008 R2.

This is an upgrade attempt from Filebeat Version 1.0.1 which was running without any issues.
First tried to upgrade to version 1.1.2, with no success, many stability issues.

Can you share some more details of the error message? Also what were the stability issues you were having? Best is to upgrade to filebeat 1.2.3.

I can upload the trace file.
Will it help ?

I ran:

filebeat.exe -c filebeat.yml 1>1.txt 2>2.txt

Got the errors written to the stderr into 2.txt.

Uploading to trace file should help. You can also share it as a gist. Running filebeat with -e -d "*" should provide you more debug information on what is happening.

Hi,

I do not know how to share as gist.
Cannot upload it as a zip file.
Is there any other way I can send it ?

Ori

Seems to be this function failing.

Have you got the message printed by:

print("runtime: VirtualAlloc of ", n, " bytes failed with errno=", getlasterror(), "\n")

?

Error can occur if VirtualAlloc fails when runtime tries to allocate some more actual pages for reserved virtual address space.

Nope.

Cannot find it in the trace file.

The message is:
fatal error: runtime: cannot map pages in arena address space

And then there is a stack trace:

fatal error: runtime: cannot map pages in arena address space

runtime stack:
runtime.throw(0x8926e0, 0x30)
/usr/local/go/src/runtime/panic.go:527 +0x7f fp=0x443efb68 sp=0x443efb5c
runtime.sysMap(0x0, 0x100000, 0x443efb01, 0xadbf78)
/usr/local/go/src/runtime/mem_windows.go:112 +0x78 fp=0x443efb84 sp=0x443efb68
runtime.mHeap_SysAlloc(0xac9e60, 0x100000, 0x1)
/usr/local/go/src/runtime/malloc.go:423 +0x121 fp=0x443efbd0 sp=0x443efb84
runtime.mHeap_Grow(0xac9e60, 0x8, 0x0)
/usr/local/go/src/runtime/mheap.go:628 +0x52 fp=0x443efc04 sp=0x443efbd0
runtime.mHeap_AllocSpanLocked(0xac9e60, 0x4, 0x1)
/usr/local/go/src/runtime/mheap.go:532 +0x574 fp=0x443efc3c sp=0x443efc04
runtime.mHeap_AllocStack(0xac9e60, 0x4, 0x32a38f01)
/usr/local/go/src/runtime/mheap.go:502 +0x6f fp=0x443efc50 sp=0x443efc3c
runtime.stackpoolalloc(0xada300, 0x3)
/usr/local/go/src/runtime/stack1.go:71 +0x55 fp=0x443efc70 sp=0x443efc50
runtime.stackalloc(0x1000, 0x0, 0x0, 0x0, 0x0, 0x0)
/usr/local/go/src/runtime/stack1.go:243 +0x29d fp=0x443efcb8 sp=0x443efc70
runtime.copystack(0x275deee0, 0x1000)
/usr/local/go/src/runtime/stack1.go:604 +0x8b fp=0x443efd30 sp=0x443efcb8
runtime.shrinkstack(0x275deee0)
/usr/local/go/src/runtime/stack1.go:880 +0x122 fp=0x443efd48 sp=0x443efd30
runtime.markroot(0x12a6c140, 0x2512)
/usr/local/go/src/runtime/mgcmark.go:130 +0x158 fp=0x443efd9c sp=0x443efd48
runtime.parfordo(0x12a6c140)
/usr/local/go/src/runtime/parfor.go:110 +0x186 fp=0x443efe00 sp=0x443efd9c
runtime.gchelper()
/usr/local/go/src/runtime/mgc.go:1701 +0x5c fp=0x443efe2c sp=0x443efe00
runtime.stopm()
/usr/local/go/src/runtime/proc1.go:1131 +0x113 fp=0x443efe40 sp=0x443efe2c
runtime.gcstopm()
/usr/local/go/src/runtime/proc1.go:1321 +0xe1 fp=0x443efe58 sp=0x443efe40
runtime.schedule()
/usr/local/go/src/runtime/proc1.go:1596 +0x90 fp=0x443efe74 sp=0x443efe58
runtime.goschedImpl(0x130a8540)
/usr/local/go/src/runtime/proc1.go:1713 +0xfa fp=0x443efe84 sp=0x443efe74
runtime.gopreempt_m(0x130a8540)
/usr/local/go/src/runtime/proc1.go:1728 +0x32 fp=0x443efe8c sp=0x443efe84
runtime.newstack()
/usr/local/go/src/runtime/stack1.go:786 +0x91b fp=0x443eff6c sp=0x443efe8c
runtime.morestack()
/usr/local/go/src/runtime/asm_386.s:336 +0x6f fp=0x443eff70 sp=0x443eff6c

goroutine 1 [chan receive]:
runtime.gopark(0x8cc9c0, 0x12c26fb0, 0x833d70, 0xc, 0x16076517, 0x3)
/usr/local/go/src/runtime/proc.go:185 +0x12d fp=0x274aca50 sp=0x274aca3c
runtime.goparkunlock(0x12c26fb0, 0x833d70, 0xc, 0x40f417, 0x3)
/usr/local/go/src/runtime/proc.go:191 +0x4b fp=0x274aca6c sp=0x274aca50
runtime.chanrecv(0x734960, 0x12c26f80, 0x274acb24, 0x1, 0x0)
/usr/local/go/src/runtime/chan.go:448 +0x3bd fp=0x274acad4 sp=0x274aca6c
runtime.chanrecv2(0x734960, 0x12c26f80, 0x274acb24, 0x23)
/usr/local/go/src/runtime/chan.go:354 +0x24 fp=0x274acaec sp=0x274acad4
github.com/elastic/beats/filebeat/crawler.(*Crawler).Start(0x274ace70, 0x12c8c000, 0x15, 0x15, 0x12a64240)
/go/src/github.com/elastic/beats/filebeat/crawler/crawler.go:60 +0x4df fp=0x274acdf8 sp=0x274acaec
github.com/elastic/beats/filebeat/beat.(*Filebeat).Run(0x12a69b20, 0x12a62930, 0x0, 0x0)
/go/src/github.com/elastic/beats/filebeat/beat/filebeat.go:92 +0x3ad fp=0x274ace98 sp=0x274acdf8
github.com/elastic/beats/libbeat/beat.(*Beat).Run(0x12a62930)
/go/src/github.com/elastic/beats/libbeat/beat/beat.go:176 +0x250 fp=0x274acf00 sp=0x274ace98
github.com/elastic/beats/libbeat/beat.Run(0x82a430, 0x8, 0x802fb8, 0x5, 0x3548d8, 0x12a69b20, 0x12a80700)
/go/src/github.com/elastic/beats/libbeat/beat/beat.go:86 +0x1c1 fp=0x274acf84 sp=0x274acf00
main.main()
/go/src/github.com/elastic/beats/filebeat/main.go:21 +0x86 fp=0x274acfb0 sp=0x274acf84
runtime.main()
/usr/local/go/src/runtime/proc.go:111 +0x234 fp=0x274acfd8 sp=0x274acfb0
runtime.goexit()
/usr/local/go/src/runtime/asm_386.s:1662 +0x1 fp=0x274acfdc sp=0x274acfd8

goroutine 2 [force gc (idle)]:
runtime.gopark(0x8cc9c0, 0xac1700, 0x8349a0, 0xf, 0x1414, 0x1)
/usr/local/go/src/runtime/proc.go:185 +0x12d fp=0x12a7cfa4 sp=0x12a7cf90
runtime.goparkunlock(0xac1700, 0x8349a0, 0xf, 0x12a76014, 0x1)
/usr/local/go/src/runtime/proc.go:191 +0x4b fp=0x12a7cfc0 sp=0x12a7cfa4
runtime.forcegchelper()
/usr/local/go/src/runtime/proc.go:152 +0xaa fp=0x12a7cfd8 sp=0x12a7cfc0
runtime.goexit()
/usr/local/go/src/runtime/asm_386.s:1662 +0x1 fp=0x12a7cfdc sp=0x12a7cfd8
created by runtime.init.4
/usr/local/go/src/runtime/proc.go:141 +0x2a

And many many more......

Ori

Argh... wrong branch. The 1.5 release branch doesn't print the reason. Go 1.6 runtime adds the reason message (filebeat 5.0-alpha2/3 might print the reason). Most likely it's an out of memory error though (not enough pages available either in RAM or swap). Not having used windows for years, I don't know much about windows memory management though.

When I run the filebeat with -e -d "*"
I have one server which writes nothing to the STDERR, and another one which keeps writing all the time the debug messages.

Command:
filebeat.exe -c filebeat.yml -e -d "*" 1>c:\programdata\filebeat\logs\1.txt 2>c:\programdata\filebeat\logs\2.txt

File 2.txt is being written to in server1 and keeps being empty in server2.

How can there be such a difference ?

Ori

Currently looking at the log files. Some things I realised:

  • You are using Filebeat 1.2.2. I strongly recommend to update to 1.2.3
  • Can you share your config file? It seems like you have quite a few prospectors.

@andrewkroh @steffens There are some interesting things in the log files. The first few messages here:

runtime: failed to create new OS thread (have 3055 already; errno=8)
fatal error: runtime.newosproc
runtime: failed to create new OS thread (have 3055 already; errno=8)
fatal error: runtime.newosproc

runtime stack:
runtime.throw(0x8452f0, 0x11)
	/usr/local/go/src/runtime/panic.go:527 +0x7f
runtime.newosproc(0x1cb87080, 0x0)
	/usr/local/go/src/runtime/os1_windows.go:278 +0x128
runtime.newm(0x0, 0x12760a00)
	/usr/local/go/src/runtime/proc1.go:1105 +0xf0
runtime.startm(0x12760a00, 0x0)
	/usr/local/go/src/runtime/proc1.go:1178 +0xd1
runtime.handoffp(0x12760a00)
	/usr/local/go/src/runtime/proc1.go:1201 +0x27c
runtime.retake(0x50595fa0, 0x118dd0, 0x0)
	/usr/local/go/src/runtime/proc1.go:3104 +0x1e4
runtime.sysmon()
	/usr/local/go/src/runtime/proc1.go:3031 +0x25c
runtime.mstart1()
	/usr/local/go/src/runtime/proc1.go:721 +0xc9
runtime.mstart()
	/usr/local/go/src/runtime/proc1.go:691 +0x59

goroutine 1 [runnable]:
github.com/elastic/beats/filebeat/crawler.(*Crawler).Start(0x192ede68, 0x1296c000, 0x15, 0x15, 0x127566c0)
	/go/src/github.com/elastic/beats/filebeat/crawler/crawler.go:60 +0x4df
github.com/elastic/beats/filebeat/beat.(*Filebeat).Run(0x12759b70, 0x127528d0, 0x0, 0x0)
	/go/src/github.com/elastic/beats/filebeat/beat/filebeat.go:92 +0x3ad
github.com/elastic/beats/libbeat/beat.(*Beat).Run(0x127528d0)
	/go/src/github.com/elastic/beats/libbeat/beat/beat.go:176 +0x250
github.com/elastic/beats/libbeat/beat.Run(0x82a430, 0x8, 0x802fb8, 0x5, 0x327648d8, 0x12759b70, 0x1276e700)
	/go/src/github.com/elastic/beats/libbeat/beat/beat.go:86 +0x1c1
main.main()
	/go/src/github.com/elastic/beats/filebeat/main.go:21 +0x86

goroutine 5 [syscall]:
os/signal.loop()
	/usr/local/go/src/os/signal/signal_unix.go:22 +0x1a
created by os/signal.init.1
	/usr/local/go/src/os/signal/signal_unix.go:28 +0x36

@ori.rubinfeld You posted this issue also here with the above messages: Filebeat crashes on windows servers Are these 2 threads for the same issue?

It seems that everything runs fine until the interesting limit of 3054 threads is hit.

I found some notes on max number of threads on windows 32bit here: https://blogs.msdn.microsoft.com/oldnewthing/20050729-14/?p=34773/

@ori.rubinfeld What is the total number of files that are harvested by filebeat?

The first thread is when I tried to use filebeat 1.1.2

I am currently testing Filebeat version 1.2.3

In my config file, I have about 15 propectors.
I think the problem occured due to bugs with the ignore_older parameter for version 1.1.2 and 1.2.2

Will update.

Ok, keep us posted. The number of prospectors should be less an issue, more how many files in total are discovered by all prospectors together.

Hi,

Sorry for the delay....
It seems that version 1.2.3 of Filebeat is stable.

Thanks,

Ori

Hi Ruflin,

Eventually, same message appeared also for Version 1.2.3

fatal error: runtime: cannot map pages in arena address space

I think it might be related to the filebeat registry file size.
It happens after the file get close to 70MB.
Had it for 3 different servers.
After deleting the registry file and restarting the service, it continues to work fine.

Ori

How many open files do you have at the time of reaching 70MB?

I am not sure, How can it be checked in Windows OS ?
I have a close_older parameter of 5 minutes.

Ori