I have a dedicated 64-core machine, and to use it I must allocate a virtualized instance.
I have the option to virtualize with anything from 1 to 240 vCPUs. Perhaps naively, I decide to allocate the maximum of 240. No other virtualized instances are running on this hardware.
I set up Elasticsearch on this instance and move traffic over from an older, high-traffic server which we are replacing.
Elasticsearch works normally at first, but CPU usage and system load gradually increase until the entire instance becomes unresponsive. (Not just Elasticsearch: the entire machine goes unresponsive.) The machine regains responsiveness after a minute or so, works for a few minutes, goes unresponsive again, and continues to cycle back and forth.
I've investigated the usual suspects: swapping, file descriptors, etc. Garbage collection looks normal. So I'm running out of ideas.
Is it possible that I am using too many vCPUs?
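
To put a number on the question, here is a minimal sketch of the oversubscription ratio implied by the figures above (240 vCPUs on 64 cores). The function name is made up for illustration, and I'm assuming the 64 figure counts physical cores rather than hardware threads.

```python
def oversubscription_ratio(vcpus: int, physical_cores: int) -> float:
    """Return how many vCPUs are allocated per physical core.

    Hypothetical helper for illustration only; both inputs are the
    numbers from the description above, not values read from the host.
    """
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return vcpus / physical_cores


# Figures from the question: 240 vCPUs allocated on a 64-core machine.
ratio = oversubscription_ratio(240, 64)
print(f"{ratio:.2f} vCPUs per physical core")
```

So the instance has several vCPUs competing for each core, even though no other instances are running on the hardware.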