Hi,
I have a single-node Elasticsearch instance running (version 1.0.0).
It was configured with multicast discovery disabled and no unicast hosts
specified. I changed the default ports from 9200/9300 to 9600/9700, and
it is configured with 5 shards and no replicas.
I just added a new node to this instance as follows:
On a new server, I installed the exact same Elasticsearch version, copied
over the configuration file from the instance above, and modified the
unicast host list to include the IP address of the first server.
I started the new node with this configuration. I then checked the
cluster and saw the new instance reported as an additional node.
I then used curl to raise the number of replicas to 2.
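For reference, the replica change described above can be sketched in Python roughly as follows. This is a minimal sketch, not the exact command that was run: the host `localhost` and the helper names `build_replica_request` and `send` are assumptions for illustration, the port 9600 is the HTTP port mentioned above, and the request targets `/_settings` (all indices) unless an index name is given.

```python
import json
import urllib.request


def build_replica_request(replicas, host="localhost", port=9600, index=None):
    """Build (but do not send) a PUT request that sets number_of_replicas.

    host and index are hypothetical; port 9600 is the HTTP port from the post.
    """
    path = "/_settings" if index is None else "/%s/_settings" % index
    body = json.dumps({"index": {"number_of_replicas": replicas}})
    return urllib.request.Request(
        "http://%s:%d%s" % (host, port, path),
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def send(request):
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `send(build_replica_request(2))` against a running node would apply the setting; building the request separately makes it easy to inspect the URL and body first.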
My questions are these:
1 - How long does it take for the data to be synchronized/distributed so
that it can be queried properly?
2 - Is the process I followed above flawed? Are there any issues with it,
and can I recover by stopping the nodes and restarting them with the
proper configuration set?
3 - Prior to adding the node, I was able to query for documents older
than 35 days (now-35d), but after the addition this data is not available.
A match_all query returns the right number of documents, but the older
documents do not seem to be queryable. If the new node that was added
goes away, how is the data affected?
1 - It depends on how much data you have.
2 - Yes. With two replicas, one replica of each shard will never be
assigned, because you have 2 nodes but 3 copies of the data (the primary
plus two replicas). Set the number of replicas to just 1.
3 - That sounds very unusual. Have you tried to fetch one of these
documents by ID?
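The arithmetic behind answer 2, and the fetch-by-ID check from answer 3, can be sketched as below. This is only an illustration: it relies on the fact that Elasticsearch does not place two copies of the same shard on one node, and the function names, the host `localhost`, and the index, type, and ID values in the usage note are hypothetical.

```python
from urllib.parse import quote


def unassigned_copies(shards, replicas, nodes):
    """Count shard copies that cannot be placed on any node.

    Each shard has 1 primary plus `replicas` copies, and at most one
    copy of a given shard can live on a single node.
    """
    copies_per_shard = 1 + replicas
    placeable_per_shard = min(copies_per_shard, nodes)
    return shards * (copies_per_shard - placeable_per_shard)


def document_url(index, doc_type, doc_id, host="localhost", port=9600):
    """Build the URL for a GET of a single document by ID."""
    parts = [quote(p, safe="") for p in (index, doc_type, doc_id)]
    return "http://%s:%d/%s" % (host, port, "/".join(parts))
```

With the numbers from this thread, `unassigned_copies(5, 2, 2)` gives 5 (one replica per shard stays unassigned), while `unassigned_copies(5, 1, 2)` gives 0. Something like `document_url("logs", "event", "abc123")` yields a URL that can be fetched with curl to check whether one of the older documents is still retrievable.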
I was running into heap memory issues on the new node, and the cluster
state went from Yellow to Green almost immediately.
The problem with my query was not so much a lack of data, or data not
being replicated/copied over to the new node, but something that seems
hokey with the date math.
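One way to narrow down a date-math problem like this is to run the same range query twice, once with the `now-35d` expression and once with an explicitly computed timestamp, and compare the hit counts. The sketch below only builds the two query bodies; the field name `@timestamp` used in the usage note and the function names are assumptions, since the thread does not say which date field is being queried.

```python
from datetime import datetime, timedelta


def date_math_query(field, days):
    """Range query that lets Elasticsearch evaluate 'now-<days>d'."""
    return {"query": {"range": {field: {"lte": "now-%dd" % days}}}}


def explicit_cutoff_query(field, days, now):
    """Same range query, with the cutoff computed on the client."""
    cutoff = now - timedelta(days=days)
    return {"query": {"range": {field: {"lte": cutoff.isoformat()}}}}
```

For example, `date_math_query("@timestamp", 35)` and `explicit_cutoff_query("@timestamp", 35, datetime.utcnow())` should match roughly the same documents. If the explicit version returns the older documents and the date-math version does not, the expression or the field mapping is the more likely culprit rather than replication.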