Unable to Run Several of the Tracks Due to Permission Errors

When running a few of the tracks, I get the following error when Rally tries to access /proc/<pid>/io.

ES Rally is running on Amazon Linux on a t2.medium. Any ideas?

[ec2-user@ip-XXX-XXX-XXX-XXX /]$ esrally --track=geopoint --distribution-version=5.5.0

    ____        ____
   / __ \____ _/ / /_  __
  / /_/ / __ `/ / / / / /
 / _, _/ /_/ / / / /_/ /
/_/ |_|\__,_/_/_/\__, /
                /____/

[INFO] Writing logs to /home/ec2-user/.rally/logs/rally_out_20171020T190634Z.log
[INFO] Rally will delete the benchmark candidate after the benchmark
[INFO] Racing on track [geopoint], challenge [append-no-conflicts] and car ['defaults']

[ERROR] Cannot race. (psutil.AccessDenied (pid=10135), 'Traceback (most recent call last):\n File "/usr/local/lib64/python3.4/site-packages/psutil/_pslinux.py", line 899, in wrapper\n return fun(self, *args, **kwargs)\n File "/usr/local/lib64/python3.4/site-packages/psutil/_pslinux.py", line 980, in io_counters\n with open_binary(fname) as f:\n File "/usr/local/lib64/python3.4/site-packages/psutil/_pslinux.py", line 141, in open_binary\n return open(fname, "rb", **kwargs)\nPermissionError: [Errno 13] Permission denied: '/proc/10135/io'\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File "/usr/local/lib/python3.4/site-packages/esrally/mechanic/mechanic.py", line 457, in receiveMessage\n self.mechanic.on_benchmark_start()\n File "/usr/local/lib/python3.4/site-packages/esrally/mechanic/mechanic.py", line 589, in on_benchmark_start\n node.on_benchmark_start()\n File "/usr/local/lib/python3.4/site-packages/esrally/mechanic/cluster.py", line 38, in on_benchmark_start\n self.telemetry.on_benchmark_start()\n File "/usr/local/lib/python3.4/site-packages/esrally/mechanic/telemetry.py", line 62, in on_benchmark_start\n device.on_benchmark_start()\n File "/usr/local/lib/python3.4/site-packages/esrally/mechanic/telemetry.py", line 311, in on_benchmark_start\n self.process_start = sysstats.process_io_counters(self.process)\n File "/usr/local/lib/python3.4/site-packages/esrally/utils/sysstats.py", line 64, in process_io_counters\n return handle.io_counters()\n File "/usr/local/lib64/python3.4/site-packages/psutil/__init__.py", line 703, in io_counters\n return self._proc.io_counters()\n File "/usr/local/lib64/python3.4/site-packages/psutil/_pslinux.py", line 907, in wrapper\n raise AccessDenied(self.pid, self._name)\npsutil.AccessDenied: psutil.AccessDenied (pid=10135)\n')

Hi @Jim_Finnessy,

for some reason there is a problem reading the process I/O counters when Elasticsearch starts up. Since all code (including the spawned processes) runs as the same user that started Rally, it should have the necessary permissions to read that file.

Can you please also add --keep-cluster-running to your command line arguments and then run cat /proc/$PID/io when this error occurs? If that does not work, run ls -la /proc/$PID to check the permissions on this directory.

You should then kill your Elasticsearch cluster manually as Rally keeps it running after the benchmark.
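If it is easier to script, the two manual checks above (cat and ls -la) can be approximated in Python. This is only a rough sketch using the standard library; the function name diagnose_proc_io and the proc_root parameter are made up for illustration and are not part of Rally or psutil.

```python
import os
import stat


def diagnose_proc_io(pid, proc_root="/proc"):
    """Report whether <proc_root>/<pid>/io exists and can be read by the current user.

    Hypothetical helper: it mirrors `cat /proc/$PID/io` and `ls -la /proc/$PID`.
    """
    proc_dir = os.path.join(proc_root, str(pid))
    io_path = os.path.join(proc_dir, "io")
    report = {
        "path": io_path,
        "current_uid": os.geteuid(),
        "process_dir_exists": os.path.isdir(proc_dir),
        "io_exists": os.path.exists(io_path),
        "owner_uid": None,
        "mode": None,
        "readable": False,
        "error": None,
    }
    if report["process_dir_exists"]:
        # Owner and mode of the process directory, like `ls -la` would show.
        st = os.stat(proc_dir)
        report["owner_uid"] = st.st_uid
        report["mode"] = stat.filemode(st.st_mode)
    if report["io_exists"]:
        try:
            # Equivalent of `cat`: actually try to read the file.
            with open(io_path, "rb") as f:
                f.read()
            report["readable"] = True
        except OSError as e:
            # PermissionError (errno 13) is the case from the traceback above.
            report["error"] = str(e)
    return report


# Example (PID taken from the error message above):
# print(diagnose_proc_io(10135))
```

If owner_uid differs from current_uid, the process was started by a different user than the one running the check, which would explain a permission error.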

Another possibility is that Elasticsearch dies for some reason during (or shortly after) startup, so that by the time Rally wants to access the I/O counters the process is already gone. In that case you should check the log files, specifically /home/ec2-user/.rally/logs/rally-actor-messages.log and the cluster logs in /home/ec2-user/.rally/benchmarks/races/YOUR_RACE_TIMESTAMP/rally-node-0/logs/.
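To tell "the process has died" apart from "the process exists but belongs to someone else", a quick check along these lines may help. It is a sketch using only os.kill with signal 0 (which sends nothing and only performs the existence and permission check); process_is_alive is a hypothetical name, not something Rally provides.

```python
import os


def process_is_alive(pid):
    """Return True if a process with this PID currently exists.

    Signal 0 delivers no signal; the kernel only checks whether the
    target exists and whether we would be allowed to signal it.
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        # No such process: it has already exited.
        return False
    except PermissionError:
        # The process exists but is owned by another user (e.g. root).
        return True
    return True


# Example (PID taken from the error message above):
# print(process_is_alive(10135))
```

A True result combined with a permission error on /proc/$PID/io points at an ownership problem rather than a crashed node.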



Thanks for the reply!

I'm pretty sure this was user error, but here's what I think happened. I ran it the way you described above. Elasticsearch stayed up, and I noticed that the /proc/$PID/io file was not there. I checked the logs, and it looks like a Rally process was unable to access the io file of the Rally process itself: the PID I found in the log was not the PID of the running Elasticsearch instance.

I think what may have happened is that I ran esrally as root, and as a result some files (maybe the distribution) ended up being owned by root. I chown'ed everything in the .rally directory back to ec2-user, and things now seem to be running fine. Thanks for pointing me in the right direction!
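For anyone hitting the same thing, a small script like the following can list what is owned by the wrong user before running chown. It is just a sketch with the standard library; find_foreign_owned is a made-up name, and the ~/.rally path in the example is the one from this thread.

```python
import os


def find_foreign_owned(root, expected_uid=None):
    """Return (path, owner_uid) pairs under root not owned by expected_uid.

    expected_uid defaults to the user running the script. Symlinks are
    inspected themselves (lstat) and not followed.
    """
    if expected_uid is None:
        expected_uid = os.getuid()
    mismatches = []
    # Check the root directory itself, then every entry below it.
    candidates = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        candidates.extend(os.path.join(dirpath, n) for n in dirnames + filenames)
    for path in candidates:
        owner = os.lstat(path).st_uid
        if owner != expected_uid:
            mismatches.append((path, owner))
    return sorted(mismatches)


# Example: run as ec2-user to see what root left behind.
# for path, uid in find_foreign_owned(os.path.expanduser("~/.rally")):
#     print(uid, path)
```

Anything it prints with uid 0 is owned by root and is a candidate for chown.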

Hi @Jim_Finnessy,

glad to hear you've resolved it now. Thanks also for the explanation of what was probably wrong.

Rally even checks that you don't run as root before it starts Elasticsearch (because Elasticsearch has a bootstrap check that prevents it from running as root). We could check this earlier in Rally, but I don't want to prevent users who use Rally just as a load generator from running as root (although that is not necessary at all).
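To illustrate the trade-off, an earlier check could look roughly like this. This is not Rally's actual code, only a sketch of the idea; check_not_root, RootUserError and the starts_elasticsearch flag are hypothetical names.

```python
import os


class RootUserError(Exception):
    """Raised when a benchmark that starts Elasticsearch is run as root."""


def check_not_root(starts_elasticsearch, euid=None):
    """Fail early if we are root and are about to start Elasticsearch.

    starts_elasticsearch: False when Rally is used purely as a load
    generator against an existing cluster; root is tolerated then.
    euid: effective user id, defaults to that of the current process.
    """
    if euid is None:
        euid = os.geteuid()
    if starts_elasticsearch and euid == 0:
        raise RootUserError(
            "Refusing to run as root: Elasticsearch will not start as root, "
            "and files created now would end up owned by root."
        )


# Example: would raise when invoked as root with a provisioned cluster.
# check_not_root(starts_elasticsearch=True)
```

Running the check only when a cluster is going to be started keeps the load-generator use case working as root.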

Anyway, that's important to know and maybe we can improve usability in this case.

