I have a single-node Hadoop machine running in OpenStack, alongside my 9-node Elasticsearch 2.2.0 cluster (3 master, 3 data, 3 client nodes). My intention is to store periodic snapshots in the HDFS store of that single-node Hadoop machine. The Hadoop processes seem to be running fine:
Without doing anything else, I get an i_o_exception saying Mkdirs failed to create file:/usr/share/elasticsearch/snaps/tests-SEbzlD19QSa9XYPL3A5aeA. Why is Elasticsearch trying to create these test files in the local filesystem? Shouldn't it be creating them in the HDFS store I provided in the JSON above?
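(The registration request itself isn't reproduced in this excerpt. For reference, a repository-hdfs registration in ES 2.x looks roughly like the following; the repository name, NameNode address, and path here are placeholders, not the poster's actual values. If the `uri` setting is missing or not picked up, one plausible outcome is that Hadoop falls back to its default local filesystem, which would match the `file:/...` paths in the errors.)

```json
PUT /_snapshot/hdfs_repo
{
  "type": "hdfs",
  "settings": {
    "uri": "hdfs://namenode-host:8020/",
    "path": "elasticsearch/snapshots"
  }
}
```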
I created a /usr/share/elasticsearch/snaps/ directory with 777 permissions. Now I get a repository_verification_exception. Here is the response I get (semi-formatted for readability):
[hdfs_repo12] [
  [aZmOht1qQEGtbDkfujD7sw, 'RemoteTransportException[[master1][192.168.10.227:9300][internal:admin/repository/verify]]; nested: RepositoryVerificationException[[hdfs_repo12] a file written by master to the store [file:/usr/share/elasticsearch/snaps] cannot be accessed on the node [{master1}{aZmOht1qQEGtbDkfujD7sw}{192.168.10.227}{192.168.10.227:9300}{data=false, master=true}]. This might indicate that the store [file:/usr/share/elasticsearch/snaps] is not shared between this node and the master node or that permissions on the store don't allow reading files written by the master node];'],
  [KLQe6APPSEWBRua0_L0eEw, 'RemoteTransportException[[data2][192.168.10.231:9300][internal:admin/repository/verify]]; nested: RepositoryVerificationException[[hdfs_repo12] a file written by master to the store [file:/usr/share/elasticsearch/snaps] cannot be accessed on the node [{data2}{KLQe6APPSEWBRua0_L0eEw}{192.168.10.231}{192.168.10.231:9300}{master=false}]. This might indicate that the store [file:/usr/share/elasticsearch/snaps] is not shared between this node and the master node or that permissions on the store don't allow reading files written by the master node];'],
  [MycDJAiRTwaLsG4hvh_sGw, 'RemoteTransportException[[data3][192.168.10.232:9300][internal:admin/repository/verify]]; nested: RepositoryVerificationException[[hdfs_repo12] a file written by master to the store [file:/usr/share/elasticsearch/snaps] cannot be accessed on the node [{data3}{MycDJAiRTwaLsG4hvh_sGw}{192.168.10.232}{192.168.10.232:9300}{master=false}]. This might indicate that the store [file:/usr/share/elasticsearch/snaps] is not shared between this node and the master node or that permissions on the store don't allow reading files written by the master node];'],
  [IK4RltZ2ToqEQMcxpWOw1w, 'RemoteTransportException[[data1] ...
(truncated)
The relevant error block in the logs is as follows:
[2016-04-08 06:09:29,542][INFO ][rest.suppressed ] /_snapshot/hdfs_repo16 Params: {wait_for_timeout=true, repository=hdfs_repo16}
RemoteTransportException[[master2][192.168.10.228:9300][cluster:admin/repository/put]]; nested: RepositoryVerificationException[[hdfs_repo16] path is not accessible on master node]; nested: NotSerializableExceptionWrapper[Cannot run program "chmod": error=13, Permission denied]; nested: NotSerializableExceptionWrapper[error=13, Permission denied];
Caused by: RepositoryVerificationException[[hdfs_repo16] path is not accessible on master node]; nested: NotSerializableExceptionWrapper[Cannot run program "chmod": error=13, Permission denied]; nested: NotSerializableExceptionWrapper[error=13, Permission denied];
at org.elasticsearch.repositories.blobstore.BlobStoreRepository.startVerification(BlobStoreRepository.java:650)
at org.elasticsearch.repositories.RepositoriesService.verifyRepository(RepositoriesService.java:211)
at org.elasticsearch.repositories.RepositoriesService$VerifyingRegisterRepositoryListener.onResponse(RepositoriesService.java:436)
at org.elasticsearch.repositories.RepositoriesService$VerifyingRegisterRepositoryListener.onResponse(RepositoriesService.java:421)
at org.elasticsearch.cluster.AckedClusterStateUpdateTask.onAllNodesAcked(AckedClusterStateUpdateTask.java:63)
at org.elasticsearch.cluster.service.InternalClusterService$SafeAckedClusterStateTaskListener.onAllNodesAcked(InternalClusterService.java:723)
at org.elasticsearch.cluster.service.InternalClusterService$AckCountDownListener.onNodeAck(InternalClusterService.java:1003)
at org.elasticsearch.cluster.service.InternalClusterService$DelegetingAckListener.onNodeAck(InternalClusterService.java:942)
at org.elasticsearch.cluster.service.InternalClusterService.runTasksForExecutor(InternalClusterService.java:627)
at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:762)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:231)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:194)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: NotSerializableExceptionWrapper[Cannot run program "chmod": error=13, Permission denied]; nested: NotSerializableExceptionWrapper[error=13, Permission denied];
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:486)
at org.apache.hadoop.util.Shell.run(Shell.java:456)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:815)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:798)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:728)
at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:225)
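(Worth noting: the bottom frames of this trace go through org.apache.hadoop.fs.RawLocalFileSystem, Hadoop's local-filesystem implementation, which matches the file:/usr/share/elasticsearch/snaps store paths above. Hadoop's local filesystem sets permissions by shelling out to chmod, so error=13 means the JVM could not execute that binary. A minimal sketch of what errno 13, EACCES, looks like on exec, using a hypothetical temporary script:)

```shell
# errno 13 is EACCES: the kernel refused to execute the target file.
# Simulate it by making a script readable but not executable.
tmp=$(mktemp)
printf '#!/bin/sh\necho hi\n' > "$tmp"
chmod 644 "$tmp"                        # rw-r--r--: no execute bit set
"$tmp" 2>&1 || echo "exit status: $?"   # the shell reports Permission denied, status 126
rm -f "$tmp"
```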
It looks like some other snapshot process is running; the /snaps/tests path suggests as much.
This also looks like a permission error. Do you have seccomp installed, by any chance? (Double-check the ES logs at startup.) Can the ES user invoke chmod?
Hi @costin, thanks for the reply. I'm confident there isn't any snapshot operation in progress, because I couldn't create any repository successfully. To check anyway, I hit each repository's /_snapshot/<repo_name>/_status API, and they all return this JSON:
{
"snapshots": []
}
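(To rule out in-progress snapshots cluster-wide in a single call, rather than per repository, ES also accepts the status endpoint without a repository name; it lists all currently running snapshots:)

```json
GET /_snapshot/_status
```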
My nodes can't find a seccomp binary when I try to execute it, so it doesn't seem to be installed. Is there anything particular to look for in the ES logs? (Searching for "seccomp" gives no results.)
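(For what it's worth, seccomp is a kernel facility rather than a standalone binary, so `which seccomp` finding nothing is expected. ES 2.x reports its system-call filter status in the startup logs, which is presumably what costin meant; a quick way to look, assuming the default package-install log location:)

```shell
# The log path below is the default for RPM/DEB installs; adjust if ES logs elsewhere.
grep -i seccomp /var/log/elasticsearch/*.log 2>/dev/null \
  || echo "no seccomp lines found (or logs are elsewhere)"
```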
Any idea how I can check whether ES can invoke `chmod`? I tried this:
[root@master2 elasticsearch]# su elasticsearch
This account is currently not available.
When I try this as a normal user it prompts me for a password, but I haven't explicitly configured any password for the elasticsearch Linux user.
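("This account is currently not available." is the standard message printed by a /sbin/nologin login shell, which service accounts typically have, so the su failure doesn't by itself mean the account is broken. A sketch of how to inspect the account and run a command as it anyway; the user name and paths are assumptions about a typical package install:)

```shell
ES_USER=elasticsearch
if getent passwd "$ES_USER" >/dev/null; then
  # Field 7 of the passwd entry is the login shell; /sbin/nologin explains
  # why plain `su elasticsearch` prints "This account is currently not available."
  getent passwd "$ES_USER" | cut -d: -f7
  # Bypass the nologin shell to test whether the account can exec chmod:
  sudo -u "$ES_USER" /bin/sh -c \
    'command -v chmod && touch /tmp/es-chmod-test && chmod 600 /tmp/es-chmod-test && echo chmod-ok' \
    || echo "could not run as $ES_USER (sudo rights?)"
else
  echo "no $ES_USER account on this host"
fi
```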
Also, chmod does have execute permission for all users: