I'm using version 14.3 of Elasticsearch. I have configured two data
nodes on separate servers, plus one non-data node and one batch
indexer, each on its own server. When I import a large data set
(around 16 GB) into Elasticsearch, the data should be split between
the two data node servers. In my case, however, data node 1 holds
about 80% of the data and data node 2 holds much less. When I imported
a smaller data set (below 6 GB), both data nodes shared it equally and
used the same amount of space on both servers. Also, I expected
optimization to increase disk usage to only about twice the original
size, but here it takes 5 to 7 times. So I have to free up space on my
server, if there is any to free; otherwise I need to increase the HDD
size.
Is the problem caused by the import size (importing a large data
set)? If so, how do I fix it?
Do I need to configure any settings so that the data is shared
equally between the two data node servers?
How do I configure the optimize operation so that it takes only
twice the original size while optimizing?
If you have two nodes, then the same number of shards should be on both servers (assuming you have 1 replica). So the varying index size comes from the different "state" each shard's index is in, and from when merging of internal segments kicked in.
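To tell an uneven shard count apart from shards that merely differ in size, it can help to tally both per node. The following is a minimal sketch, not an Elasticsearch API call: the shard listing, the index name `docs`, the node names, and the sizes are all invented for illustration, and you would fill the list in from whatever shard status output your version provides.

```python
GB = 1024 ** 3

# Hypothetical shard listing: (index, shard number, node name, bytes on disk).
# Every name and size here is invented for illustration.
SHARDS = [
    ("docs", 0, "data-node-1", 7 * GB),
    ("docs", 1, "data-node-2", 2 * GB),
    ("docs", 2, "data-node-1", 6 * GB),
    ("docs", 3, "data-node-2", 1 * GB),
]


def summarize_by_node(shards, nodes=()):
    """Return {node: (shard_count, total_bytes)}.

    `nodes` lists the data nodes that should hold shards, so a node
    holding none still shows up with a count of zero.
    """
    totals = {node: [0, 0] for node in nodes}
    for _index, _shard, node, size in shards:
        entry = totals.setdefault(node, [0, 0])
        entry[0] += 1
        entry[1] += size
    return {node: (count, size) for node, (count, size) in totals.items()}


def counts_balanced(summary, tolerance=1):
    """True when no node holds more than `tolerance` shards above any other."""
    counts = [count for count, _size in summary.values()]
    return max(counts) - min(counts) <= tolerance


summary = summarize_by_node(SHARDS, nodes=["data-node-1", "data-node-2"])
for node, (count, size) in sorted(summary.items()):
    print(f"{node}: {count} shards, {size / GB:.1f} GB")
print("shard counts balanced:", counts_balanced(summary))
```

In this made-up listing each node holds two shards, yet one node carries most of the bytes. That is the situation described above: the allocation is even and only the on-disk size of the individual shards differs.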
On Wednesday, February 23, 2011 at 8:47 PM, Meenakshi wrote: [original post, quoted in full above]
I am not using any replicas in this environment. Instead, I rely on my
gateway, which is configured on one of my servers. In this scenario,
how do I get the shards to be shared equally? For example, if I have
160 GB of data in the gateway, it should be split into two shards of
80 GB each. But here I got 0 KB in one shard and 160 GB in the other.
Even if I delete those two shards from the servers, the data is still
split the same way.
I found one more scenario:
The same 160 GB of data is split into two shards, initially 80 GB
each, but after some time one shard shrinks and the other grows to
165 GB.
You should not get that. Make sure the nodes discover each other.
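One way to check discovery is to compare the node counts each node reports against what you expect. The sketch below assumes you have already fetched and parsed the cluster health response from each node, and that it carries the `number_of_nodes` and `number_of_data_nodes` fields found in the Elasticsearch versions I know; please confirm those field names against your version. The sample responses are invented, and the expected total of 3 assumes the batch indexer does not join the cluster as a node.

```python
def discovery_problems(health, expected_nodes, expected_data_nodes):
    """Return human-readable problems found in one parsed health response.

    `health` is a dict as parsed from the cluster health JSON; the two
    field names used here are an assumption to verify for your version.
    """
    problems = []
    nodes = health.get("number_of_nodes")
    data_nodes = health.get("number_of_data_nodes")
    if nodes != expected_nodes:
        problems.append(f"sees {nodes} nodes, expected {expected_nodes}")
    if data_nodes != expected_data_nodes:
        problems.append(
            f"sees {data_nodes} data nodes, expected {expected_data_nodes}"
        )
    return problems


# Hypothetical responses, one per node that was asked.
responses = {
    "data-node-1": {"number_of_nodes": 3, "number_of_data_nodes": 2},
    "data-node-2": {"number_of_nodes": 1, "number_of_data_nodes": 1},
}

for name, health in sorted(responses.items()):
    issues = discovery_problems(health, expected_nodes=3, expected_data_nodes=2)
    print(name, "OK" if not issues else "; ".join(issues))
```

A node that reports only itself, like `data-node-2` in this made-up sample, has formed its own cluster, which would explain one node ending up with no data at all.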
On Saturday, February 26, 2011 at 12:09 AM, Jagmee wrote: [post quoted in full above]