My application has been running in production for months. Yesterday the
server suddenly hung, and since then I have noticed very odd behaviour when
monitoring the ES process. I know this may sound strange, but this is
exactly what is happening now (apologies if something I say makes no sense,
but I'm desperate right now):
- I restart the server and start a Tomcat web application and the ES
server (both on the same machine).
- Tomcat is configured to use 1536 MB, whereas ES uses 900 MB. This is a
CentOS 5.5 server with 3 GB of physical RAM.
- If I run a search in the web application, which internally queries the ES
server through a Transport client, I can see the CPU usage of the ES Java
process (using 'top') climb to 99.9%, then maybe drop to 67%, then rise
again to 98.8%, and so on, for around 10 seconds, until my webapp finally
replies with the results.
- After this very first query, I monitor the CPU usage of the ES Java process
and I can see a constant, repeating pattern of 0% -> 5% -> 15% -> 0% ->
5% -> 15% -> etc., even though no requests are being made through the web site.
- From then on, if I execute the queries directly on the server using 'curl',
they respond in milliseconds, but if I run the same query through the web
application, the ES Java process climbs to 99% for several seconds as
described above (so this rules out the Tomcat Java process eating the CPU).
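
To put the memory figures above in perspective, here is a rough sketch of
the arithmetic. It assumes that 3 GB means 3072 MB and that the 1536 MB and
900 MB figures are the maximum Java heap sizes of the two processes; neither
assumption is confirmed, and the class and method names are made up for
illustration.

```java
// Rough memory budget for the machine described above.
// Assumptions (not confirmed by the setup): 3 GB = 3072 MB, and both
// figures are maximum heap sizes of the two Java processes.
class MemoryBudget {

    /** Returns the RAM (in MB) left over once both heaps are fully used. */
    static long headroomMb(long physicalMb, long tomcatHeapMb, long esHeapMb) {
        return physicalMb - (tomcatHeapMb + esHeapMb);
    }

    public static void main(String[] args) {
        // 3072 - (1536 + 900) = 636
        long headroom = headroomMb(3072, 1536, 900);
        System.out.println("Headroom outside the two heaps: " + headroom + " MB");
    }
}
```

Under these assumptions about 636 MB would remain for the OS, the non-heap
memory of both JVMs and the file system cache, so it may be worth checking
whether the machine is swapping.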
Has anyone experienced this or a similar issue? Is this an OS issue? How can
it suddenly stop working properly when no data has been added or removed?
What can I check?
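
One possible check is to see which threads inside a JVM are consuming the
CPU. The following is only a minimal sketch using the standard
java.lang.management API; the class and method names are hypothetical. It
inspects the JVM it runs in, so looking at the ES process itself would need
something extra that is not shown here, such as a remote JMX connection or a
thread dump taken with jstack.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the threads of the current JVM by CPU time used.
class ThreadCpuSampler {

    /** Returns one "cpuMillis ms  threadName" line per live thread, busiest first. */
    static List<String> busiestThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> lines = new ArrayList<>();
        if (!mx.isThreadCpuTimeSupported()) {
            return lines; // this JVM cannot measure per-thread CPU time
        }

        // Collect {cpuNanos, threadId} pairs for every thread with a valid reading.
        List<long[]> samples = new ArrayList<>();
        for (long id : mx.getAllThreadIds()) {
            long cpuNanos = mx.getThreadCpuTime(id); // -1 if dead or measurement disabled
            if (cpuNanos >= 0) {
                samples.add(new long[] {cpuNanos, id});
            }
        }

        // Sort by CPU time, highest first.
        samples.sort((a, b) -> Long.compare(b[0], a[0]));

        for (long[] sample : samples) {
            ThreadInfo info = mx.getThreadInfo(sample[1]);
            if (info != null) { // the thread may have ended since it was sampled
                lines.add((sample[0] / 1_000_000) + " ms  " + info.getThreadName());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : busiestThreads()) {
            System.out.println(line);
        }
    }
}
```

The values are cumulative CPU time per thread, so comparing two samples taken
a few seconds apart gives a better picture of what is busy at a given moment.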