Multiple ES nodes pointing to the same data location

Hi All,

We are using Elasticsearch for indexing and storage.
ES supports running multiple nodes on one machine, but it creates a separate
data location for each node, keyed by node id.
For example:
Index location for Node 1: /home/user/es/data/nodes/0/indices
Index location for Node 2: /home/user/es/data/nodes/1/indices

However, we want to run multiple nodes on one machine with all nodes using
the same data location, /home/user/es/indices, instead of separate data
locations per node id.

Why do we want multiple nodes that point to the same location?
So that if one node goes down, another node can take its place.

We also want to achieve high availability without using replication
(replica = 0).

Please share some ideas for achieving the above use case.

Thank you very much in advance

Regards,
Ankit Jain

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Hi Ankit Jain,

Since you are running 2 nodes with the same cluster name (I guess "data"),
and, as far as I can tell, both nodes are able to store data (node.data: true
is the default setting), Elasticsearch will create 2 node directories
("/nodes/0/indices" and "/nodes/1/indices") in the filesystem.

If you want only one path ("/nodes/0/indices") to be used by the 2 nodes,
then on the 2nd node set "node.data: false" or "node.client: true" in the
elasticsearch.yml file. The 2nd node then stores no index data of its own,
so only that one location is used, and you will be able to index and search
your data through both nodes.

You are correct. As far as I know, if we use multiple nodes and one node
fails, the other will back up the failed node.

As far as I know, replicas provide the backup when a shard goes down or
fails. If you don't want replicas, set "index.number_of_replicas: 0" in the
elasticsearch.yml file.

Regards
Mohammad Rafi.

Hi Mohammad,

Thanks for the reply.

Suppose I have set index.number_of_replicas: 0 in the elasticsearch.yml
file.

With that configuration (zero replicas), if the machine goes down, the data
on that machine is not accessible.

Can you suggest some ways to handle that scenario? I don't want to
replicate my index.

Thanks,
Ankit Jain

Hi Ankit Jain.

You wish for two ES nodes running on one machine to safely use one copy of
the data with no replicas of that data, and to offer high availability.

The most likely failure is machine hardware, with disk, power, and network
being the usual culprits. With 0 replicas, the loss of the machine means you
have no availability.

If you have two instances of ES sharing the same non-replicated data, then
you are basically saying that if one instance of ES crashes, the second
instance, which runs the same code and has the exact same bugs as the
crashed instance, will somehow not crash.

In my experience replicas are very cheap, and they are the only true way to
offer high availability. The recommendation is to set up a cluster of at
least 3 nodes and to require at least two nodes (a majority) before a master
can be elected.
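
As a rough illustration of the "at least two nodes to elect a master" advice: the usual rule of thumb is a strict majority of the master-eligible nodes, (n / 2) + 1. The sketch below is plain arithmetic in Python and talks to no cluster. The function name is made up for this example, and the assumption is that the resulting number would be used for the discovery.zen.minimum_master_nodes setting, which the post does not name.

```python
def majority_quorum(master_eligible_nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority.

    Hypothetical helper for illustration only; it is not part of
    Elasticsearch.
    """
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(n, "master-eligible node(s) -> quorum of", majority_quorum(n))
```

For the 3-node cluster described here this gives 2, so the cluster can still elect a master after losing one node, while a lone isolated node cannot elect itself.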

My own experience: on a set of 3 relatively small Solaris machines used for
testing this case, I ran an initial load (index only) of 25 million
documents into one of the nodes. The index was created with 5 shards and 0
replicas for the bulk load.

I then changed the number of replicas to 2 (for a total of 3 copies of each
shard), and the 5 shards were copied automatically to the other nodes in a
matter of a few minutes.

I then bulk-loaded my set of 3 million updates (mix of index + delete), and
all went smoothly.

A few nights ago, there was a failure of the server room's cooling and one
of the machines overheated and powered off. The cluster now only had 2
nodes, but stayed green and available due to those two nodes. We brought up
that 3rd node, and in a minute or so there were now "three in the green".
No data loss, no service loss, and a very smooth and automatic cluster
recovery.
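
The difference between the two setups in this thread can be sketched with a toy model. This is a deliberate simplification, not how Elasticsearch allocates shards: copies are placed round-robin, and the only rule taken from the real system is that two copies of the same shard never share a node. All function and node names are hypothetical.

```python
def allocate(nodes, primaries, replicas):
    """Place every copy of a shard on a different node (round-robin).

    Returns {shard_id: set of node names holding a copy of that shard}.
    """
    copies = replicas + 1  # the primary plus its replicas
    if copies > len(nodes):
        raise ValueError("more copies per shard than nodes")
    return {
        shard: {nodes[(shard + c) % len(nodes)] for c in range(copies)}
        for shard in range(primaries)
    }


def unavailable_shards(placement, failed_nodes):
    """Shards that have no copy left on a surviving node."""
    failed = set(failed_nodes)
    return sorted(s for s, holders in placement.items() if not holders - failed)


if __name__ == "__main__":
    # One data node, 5 shards, 0 replicas: losing the node loses every shard.
    single = allocate(["data-node"], primaries=5, replicas=0)
    print(unavailable_shards(single, ["data-node"]))  # [0, 1, 2, 3, 4]

    # Three nodes, 5 shards, 2 replicas: one node can fail with no shard lost.
    cluster = allocate(["node1", "node2", "node3"], primaries=5, replicas=2)
    print(unavailable_shards(cluster, ["node3"]))  # []
```

The first case corresponds to the single data node with replica = 0, where every shard becomes unavailable at once. The second corresponds to the 3-node test cluster, where each surviving node still holds a full copy of every shard.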

Brian

Hi Mohammad,

I tried to run two nodes on one machine (the 1st node is a data node and the
2nd node is a client node). When I query the client node, it forwards the
request to the data node and I get the desired result. But if my data node
goes down, I get the exception below:

Exception in thread "main"
org.elasticsearch.action.search.SearchPhaseExecutionException:
Failed to execute phase [query], total failure; shardFailures
{[na][ipdr_359448][2]: No active shards}
{[na][ipdr_359448][1]: No active shards}
{[na][ipdr_359448][0]: No active shards}
{[na][ipdr_359448][4]: No active shards}
{[na][ipdr_359448][3]: No active shards}

Can you advise how we can access the data of the data node if the data node
goes down?

Thanks,
Ankit Jain

Hi Ankit Jain,

> I tried to run two nodes on one machine (1st node is a data node and 2nd
> node is a client node).
> When I query the client node, it forwards the request to the data node
> and I get the desired result.
> But if my data node goes down, I get the exception below:

As far as I know, the client node works only while a master node exists (in
your case the data node is the master). So the above is not a problem in
Elasticsearch: the client node depends on the other, master node, so you
must always have one master node running on your machine.

> how we can access data of data node, if data node goes down?

InquiringMind described the solution to this; please read that post as well.
That is, if there is any chance of your master node failing for any reason,
you can run more than one master node, so that if one master fails the
other master node provides the backup.

So if you want a backup for when the master node fails, either run 2 nodes
as masters and 1 as a client as usual, or simply run both nodes as masters.

Best Regards.
Mohammad Rafi.

Hi Ankit Jain,

Sorry for the late response.

> I tried to run two nodes on one machine (1st node is a data node and 2nd
> node is a client node). When I query the client node, it forwards the
> request to the data node and I get the desired result. But if my data
> node goes down, I get the exception below:
> Exception in thread "main"
> org.elasticsearch.action.search.SearchPhaseExecutionException: Failed to
> execute phase [query], total failure; shardFailures
> {[na][ipdr_359448][2]: No active shards}{[na][ipdr_359448][1]: No
> active shards}{[na][ipdr_359448][0]: No active
> shards}{[na][ipdr_359448][4]: No active shards}{[na][ipdr_359448][3]:
> No active shards}

Actually, you are using one master node (the data node in your case) and
one client node. The client node works fine while the master node is
available, since the client node internally uses the master node for
indexing and searching. Without the master node the client node doesn't
work, which is why you got that error.

> Can you guide, how we can access data of data node, if data node goes
> down?

If there is any chance of losing your data, use the following setup:

  1. Prepare 2 master nodes (or 2 master nodes and 1 client node). (Ignore
    this note if you are not using bulk indexing: if your RAM size is
    greater than 60 GB, then 2 nodes give the optimum result; otherwise,
    take care that the inflow speed for indexing stays low relative to the
    outflow speed of the index operations during bulk indexing.)
  2. Create the index with a minimum of 2 primary shards and a minimum of 1
    replica.

With this setup, if 1 master node goes down, the other master provides the
backup for the lost data.
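
A minimal sketch of the index settings from step 2, written as Python that only builds the JSON body and sends nothing to a cluster. The helper name is made up for this example, and it assumes the body would be passed to the create-index API (for instance PUT /my_index, where "my_index" is a hypothetical index name).

```python
import json


def index_settings(primary_shards: int, replicas: int) -> str:
    """Build the JSON settings body for creating an index.

    Hypothetical helper: it only serialises the two settings discussed
    in the post and performs no HTTP request.
    """
    if primary_shards < 1 or replicas < 0:
        raise ValueError("need >= 1 primary shard and >= 0 replicas")
    body = {
        "settings": {
            "number_of_shards": primary_shards,
            "number_of_replicas": replicas,
        }
    }
    return json.dumps(body, indent=2)


if __name__ == "__main__":
    # The minimum suggested in step 2: 2 primary shards, 1 replica each.
    print(index_settings(primary_shards=2, replicas=1))
```

Note that a replica is only useful once a second data node is running, because a replica is never placed on the same node as its primary.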

Thanks And Regards,
Mohammad Rafi
