Service pid file is removed


(Wes Plunk) #1

If I have ES started via a service it seems that after a certain amount of
time the file will be seen as stale and removed, even if the ES processes
are still running. Can anyone explain why?

The side effect of this is that after it's removed the ES service doesn't
recognize itself as still running and will (when I submit a start request)
start another node instance on the same server, creating a mess of extra
nodes within the cluster.


(Shay Banon) #2

Are you using the Java service wrapper?



(Wes Plunk) #3

I believe I am. I downloaded/installed the service

http://github.com/elasticsearch/elasticsearch-servicewrapper

and everything seems normal. This problem has happened only once so far, so
I'm not convinced it's a big issue; however, I'm primarily trying to confirm
how it should work with regard to stale pid files so that I can set
expectations correctly on how to handle it.


(Leif Maxfield) #4

I know that this thread is ancient, but I was experiencing this issue with
the Java service wrapper, so I wanted to document what I found in case
others are seeing the same thing. In my case, it was deleting its own pid
file after startup because I am using a symlink for the current version of
ES.

In the getpid() function of the service wrapper, in the 'else' of the 'case
"$DIST_OS"' block (because I am using Linux), I made it echo the command
it was using for pidtest:

echo "$PSEXE -p $pid -o args | grep -F '$WRAPPER_CMD' | tail -1"
pidtest=`$PSEXE -p $pid -o args | grep -F "$WRAPPER_CMD" | tail -1`

The output of a service start looks like this:

$ sudo service elasticsearch start
Starting Elasticsearch...
Waiting for Elasticsearch......
/bin/ps -p 10468 -o args | grep -F '/es/elasticsearch-1.0.1/bin/service/exec/elasticsearch-linux-x86-64' | tail -1
running: PID:10468
/bin/ps -p 10468 -o args | grep -F '/es/live-version/bin/service/exec/elasticsearch-linux-x86-64' | tail -1
Removed stale pid file: /es/work/elasticsearch.pid
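
For anyone who wants to see the mismatch in isolation, here is a minimal
sketch of the check. It is not the wrapper's actual code: check_pid is a
hypothetical helper, its first argument stands in for the output of
"ps -p $pid -o args", and it assumes the wrapper treats an empty pidtest as a
stale pid file. The two paths are the ones from the output above.

```shell
#!/bin/sh
# check_pid ARGS WRAPPER_CMD
# Mimics the grep -F test from getpid(). ARGS stands in for the
# command line that ps would report for the pid (an assumption for
# this demo; no real process is inspected).
check_pid() {
    pidtest=$(printf '%s\n' "$1" | grep -F "$2" | tail -1)
    if [ -z "$pidtest" ]; then
        # Assumed wrapper behaviour: no match means the pid file is stale.
        echo "stale"
    else
        echo "running"
    fi
}

real='/es/elasticsearch-1.0.1/bin/service/exec/elasticsearch-linux-x86-64'
link='/es/live-version/bin/service/exec/elasticsearch-linux-x86-64'

# The process was started via the real path, so ps reports $real.
check_pid "$real" "$real"   # WRAPPER_CMD built from the real path
check_pid "$real" "$link"   # WRAPPER_CMD built from the symlink path
```

The first call matches and the second does not, because grep -F compares the
path strings literally and knows nothing about the symlink.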

I'm not sure why the first call to 'getpid' is using the "true" path while
the second call is using the symlink, but this is causing the service
wrapper to delete its own PID. One way around this, if you want to point
your init script at a symlink, is to comment out all of the "ES_HOME
determination" stuff at the top of the script, and under it just set it
manually:

# Just set ES_HOME statically rather than doing anything fancy.
ES_HOME="/es/live-version"

Now, the execution of ES looks like this:

$ sudo service elasticsearch start
Starting Elasticsearch...
Waiting for Elasticsearch......
/bin/ps -p 10744 -o args | grep -F
'/es/live-version/bin/service/exec/elasticsearch-linux-x86-64' | tail -1
running: PID:10744
/bin/ps -p 10744 -o args | grep -F
'/es/live-version/bin/service/exec/elasticsearch-linux-x86-64' | tail -1

And then it doesn't delete its own PID. One downside is you need to edit
the script in this way each time you upgrade.
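
A possible alternative to hard-coding ES_HOME, sketched here under
assumptions and not tested against the wrapper script itself: resolve
symlinks once so that every path derived from ES_HOME uses the same string.
resolve_home is a hypothetical helper, the temporary directory only mirrors
the layout described above, and "readlink -f" assumes GNU coreutils (normal
on Linux).

```shell
#!/bin/sh
# resolve_home PATH
# Print PATH with all symlinks resolved (relies on GNU readlink -f).
resolve_home() {
    readlink -f "$1"
}

# Throwaway layout: a versioned directory plus a "live-version" symlink.
tmp=$(mktemp -d)
mkdir "$tmp/elasticsearch-1.0.1"
ln -s "$tmp/elasticsearch-1.0.1" "$tmp/live-version"

real_home=$(resolve_home "$tmp/elasticsearch-1.0.1")
link_home=$(resolve_home "$tmp/live-version")

# Both spellings resolve to the same directory, so a command path
# built from either one would produce the same grep -F pattern.
if [ "$real_home" = "$link_home" ]; then
    echo "same ES_HOME: $real_home"
fi

rm -rf "$tmp"
```

The trade-off is that the resolved path always points at the versioned
directory, so the pid check stays consistent but the symlink name no longer
appears in the process arguments.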



(Alexander Reelsen) #5

Hey,

Can you create a GitHub issue in the servicewrapper repo, including an easy
reproduction, so this does not get lost in mailing list traffic? Thanks a lot!

--Alex


