Hi,
I've noticed that my servers - both physical machines and Xen-based VMs, each running a pair of ES nodes (0.19.8) on Ubuntu 12.04 LTS server, in 64-bit and 32-bit flavours - all show extremely high numbers of context switches and interrupts in vmstat output. ES performance, though, is fine.
Here's a sample from the Xen VM:
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0   5912 187796  40628 312880    0    0     0     0 1948 3261  0  0 100  0
 0  0   5912 187920  40628 312884    0    0     0     0 1680 3000  0  0 100  0
 0  0   5912 187620  40656 312856    0    0    28    84 2224 2936  7  1 92  0
 0  0   5912 187612  40656 312888    0    0     0    64 1839 3043  4  1 95  0
 0  0   5912 252364  40656 312888    0    0     0     0 1921 3339  1  1 98  0
(Stopped ES here)
 0  0   5912 252364  40656 312888    0    0     0     0  140  152  0  0 100  0
 0  0   5912 252636  40656 312856    0    0     0     0  101  148  0  0 100  0
 0  0   5912 252636  40656 312856    0    0     0     0  141  181  0  0 100  0
 0  0   5912 252636  40656 312856    0    0     0    84  102  154  0  0 100  0
 0  0   5912 252636  40656 312856    0    0     0     0   98  155  0  0 100  0
 0  0   5912 252760  40656 312856    0    0     0     0  109  159  0  0 100  0
 0  0   5912 252760  40656 312856    0    0     0     0  127  181  0  0 100  0
 0  0   5912 252760  40656 312856    0    0     0     0  154  197  0  0 100  0
(Restarted it here)
 3  0   5912 271392  40824 312996    0    0   320    24 1831  662 15  5 79  0
 2  0   5912 222604  40824 313020   32    0    32   212 3968 1226 45 10 45  0
 2  0   5912 243700  40824 313052    0    0     0     0 2571 1252 23  5 71  0
This is on a box which is basically idle - no indexing or searching activity. CPU usage is very low, about 1%, yet the load average is high for an idle box, around 1.1 (on a 4-core VM). The VM above is a small test system, with only about 50 docs in ES in one index.
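To put rough numbers on the difference, here's a minimal sketch that averages the `in` (interrupts/s) and `cs` (context switches/s) columns over the vmstat rows above, before and after stopping ES. The function and variable names are my own, purely for illustration. It assumes the 16-column vmstat layout shown in this sample; some vmstat versions print an extra `st` column, and rows with any other field count are simply skipped.

```python
def mean_in_cs(vmstat_rows):
    """Return (mean 'in', mean 'cs') over vmstat data rows.

    Assumes the 16-column layout from the sample above:
    r b swpd free buff cache si so bi bo in cs us sy id wa
    """
    ins, css = [], []
    for row in vmstat_rows:
        fields = row.split()
        # Skip header lines and annotations: keep only rows with
        # exactly 16 all-numeric fields.
        if len(fields) != 16 or not all(f.isdigit() for f in fields):
            continue
        ins.append(int(fields[10]))  # 'in' column
        css.append(int(fields[11]))  # 'cs' column
    if not ins:
        raise ValueError("no vmstat data rows found")
    return sum(ins) / len(ins), sum(css) / len(css)


# Rows copied from the sample above (restart rows left out).
RUNNING = [
    "0 0 5912 187796 40628 312880 0 0 0 0 1948 3261 0 0 100 0",
    "0 0 5912 187920 40628 312884 0 0 0 0 1680 3000 0 0 100 0",
    "0 0 5912 187620 40656 312856 0 0 28 84 2224 2936 7 1 92 0",
    "0 0 5912 187612 40656 312888 0 0 0 64 1839 3043 4 1 95 0",
    "0 0 5912 252364 40656 312888 0 0 0 0 1921 3339 1 1 98 0",
]
STOPPED = [
    "0 0 5912 252364 40656 312888 0 0 0 0 140 152 0 0 100 0",
    "0 0 5912 252636 40656 312856 0 0 0 0 101 148 0 0 100 0",
    "0 0 5912 252636 40656 312856 0 0 0 0 141 181 0 0 100 0",
    "0 0 5912 252636 40656 312856 0 0 0 84 102 154 0 0 100 0",
    "0 0 5912 252636 40656 312856 0 0 0 0 98 155 0 0 100 0",
    "0 0 5912 252760 40656 312856 0 0 0 0 109 159 0 0 100 0",
    "0 0 5912 252760 40656 312856 0 0 0 0 127 181 0 0 100 0",
    "0 0 5912 252760 40656 312856 0 0 0 0 154 197 0 0 100 0",
]

run_in, run_cs = mean_in_cs(RUNNING)
idle_in, idle_cs = mean_in_cs(STOPPED)
print("ES running: in=%.1f/s cs=%.1f/s" % (run_in, run_cs))
print("ES stopped: in=%.1f/s cs=%.1f/s" % (idle_in, idle_cs))
```

With only a handful of one-second samples this is a ballpark figure, not a measurement, but it makes the before/after gap easy to compare across machines.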
On the VM, each ES node has a tiny 128MB heap; on the physical machine, each node has 6GB (the machine has 16GB of RAM). Apart from the heap size and the http/transport ports, all settings are default.
The physical box shows similar behaviour, even though it holds 31 million docs taking about 75GB across 16 indices.
In each case, I'm running 2 nodes on one machine (just a test setup really, to check clustering - prod will be multiple machines). Knocking out one node only halves the context switches, though; it doesn't reduce them to 'idle' levels of a couple of hundred.
Performance seems fine, happily. But does this ring any alarm bells for people? Anything I should be worried about? Do other people see this? Or should I just not care?
Here's some system info from the VM:
root@linode-3:/var/log/supervisor# java -version
java version "1.6.0_24"
OpenJDK Runtime Environment (IcedTea6 1.11.3) (6b24-1.11.3-1ubuntu0.12.04.1)
OpenJDK Client VM (build 20.0-b12, mixed mode, sharing)
root@linode-3:/var/log/supervisor# uname -a
Linux linode-3.secondsync.com 3.4.2-linode44 #1 SMP Tue Jun 12 15:04:46 EDT 2012 i686 i686 i386 GNU/Linux
... and from the physical machine:
root@server-2:~# java -version
java version "1.6.0_24"
OpenJDK Runtime Environment (IcedTea6 1.11.3) (6b24-1.11.3-1ubuntu0.12.04.1)
OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)
root@server-2:~# uname -a
Linux server-2.secondsync.com 3.2.0-24-generic #37-Ubuntu SMP Wed Apr 25 08:43:22 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Cheers,
Dan
Dan Fairs | dan.fairs@gmail.com | @danfairs | www.fezconsulting.com