Querying Multiple Nodes on Same Host

rahulnadella · October 29, 2015, 11:48pm

I am new to Elasticsearch (1.7) and am currently running Logstash (1.5.4) and Elasticsearch on the same host. I am trying to scale Elasticsearch to multiple instances. I have been able to get multiple Elasticsearch instances working, with data being correctly inserted into their indices. How can I retrieve the data from all three instances with one query? (I am able to retrieve it from the individual instances.)

magnusbaeck · October 30, 2015, 7:59am

Are these instances part of the same cluster? If yes, then you can query any node for any index that's part of the cluster. If no, your only option would be setting up a tribe node.
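To illustrate "query any node for any index", here is a minimal Python sketch that sends the same search to each node and compares the hit counts. The three HTTP ports are an assumption (the thread never states them), `hit_totals` is a hypothetical helper, and the `hits.total` integer follows the 1.x response shape. If the nodes really are in one cluster, every node should report the same total.

```python
import json
import urllib.request

# Assumed HTTP addresses of the three nodes on one host (ports are hypothetical).
NODES = ["http://localhost:9200", "http://localhost:9201", "http://localhost:9202"]


def search_url(node, index_pattern):
    """Build the _search URL for an index pattern on a given node."""
    return "{}/{}/_search".format(node.rstrip("/"), index_pattern)


def search(node, index_pattern, query):
    """POST a query to one node; that node coordinates the search cluster-wide."""
    request = urllib.request.Request(
        search_url(node, index_pattern),
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def hit_totals(index_pattern, query, nodes=NODES, search_fn=search):
    """Run the same query against every node and collect hits.total from each."""
    return {
        node: search_fn(node, index_pattern, query)["hits"]["total"]
        for node in nodes
    }
```

Calling `hit_totals("estimate-*", {"query": {"match_all": {}}})` against a live cluster would be one way to confirm that a single connection is enough; the application only needs one of these addresses.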

rahulnadella · October 30, 2015, 11:17am

We are using the same cluster to group the nodes, so basically I would have to specifically assign each index (time-based) to a specific node through Logstash. I think I was using round-robin, so the indices were appearing on each node. I was able to retrieve the results from a specific node, but was not able to figure out how to aggregate data from multiple nodes in the same cluster. I currently deploy through Ansible, so if this is the case it would probably make deployment a little easier, but it would make it more difficult to retrieve the data through our Django application.

magnusbaeck · October 30, 2015, 5:57pm

We are using the same cluster to group the nodes, so basically I would have to specifically assign each index (time-based) to a specific node through Logstash.

No... why? Logstash can't control where indexes and shards are allocated.

Again, just connect to any cluster node and let ES deal with everything.

rahulnadella · October 31, 2015, 5:48pm

I am currently trying to scale the system vertically. I am still a little unclear on how the data is supposed to go from Logstash to Elasticsearch without being duplicated (the same index exists in each node). Currently each index is showing up in each node I have deployed. My configuration settings are below:

Elasticsearch settings (each node is configured to have its own config, data/work directory, and log file)
NODE 1
Cluster Name: Supply
Node Name: es1
Node Master: True
Node Data: True
Minimum Master Nodes: 2
Multicast: False
Unicast Hosts: [localhost:9300]

NODE 2
Cluster Name: Supply
Node Name: es2
Node Master: True
Node Data: True
Minimum Master Nodes: 2
Multicast: False
Unicast Hosts: [localhost:9300]

NODE 3
Cluster Name: Supply
Node Name: es3
Node Master: True
Node Data: True
Minimum Master Nodes: 2
Multicast: False
Unicast Hosts: [localhost:9300]

Logstash Parser output:
elasticsearch {
  host => "localhost"
  protocol => "http"
  port => "9200"
  index => "estimate-%{+YYYY-MM-dd}"
  cluster => "sv"
  template => "/opt/logsearch/templates/logsearch_apple_template.json"
  template_name => "apple_template"
  template_overwrite => true
}

Thanks for the help.
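The `index` option in the Logstash output above creates one index per day, so a query from the application side usually has to name several indices at once. Here is a minimal sketch of how those names could be generated in Python, assuming the `estimate-` prefix from the config; `daily_indices` and `search_path` are hypothetical helper names. Elasticsearch accepts a comma-separated index list (or a wildcard such as `estimate-*`) in the search path, and the request can be sent to any one node.

```python
from datetime import date, timedelta


def daily_indices(start, end, prefix="estimate-"):
    """Return the daily index names from start to end, inclusive."""
    days = (end - start).days
    return [
        prefix + (start + timedelta(days=n)).strftime("%Y-%m-%d")
        for n in range(days + 1)
    ]


def search_path(indices):
    """Join index names into a multi-index _search path."""
    return "/" + ",".join(indices) + "/_search"


if __name__ == "__main__":
    names = daily_indices(date(2015, 10, 30), date(2015, 11, 1))
    print(search_path(names))
```

Logstash builds the date part of the index name from each event's `@timestamp`, so a range chosen in local time may need an extra day on either side.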

magnusbaeck · October 31, 2015, 6:42pm

I am currently trying to scale the system vertically.

I think you mean horizontally (i.e. increasing the total capacity by adding machines).

Currently each index is showing up in each node I have deployed.

What, exactly, do you mean by this and what makes you reach that conclusion?

rahulnadella · October 31, 2015, 11:51pm

Question 1:
We plan on scaling up to a bigger server, not to more servers (scaling out). We will only have one server in deployment. We were trying to speed up queries that use aggregations, because the indices have become too large and query performance is slow. We thought that by scaling out the number of Elasticsearch instances we could spread the load across multiple instances, which would improve the performance of the aggregations.

Question 2:
Basically, I am seeing the same number of indices with the same document counts, returning the same results, in each of the nodes when I query it. I am not sure whether this is correct. If it is, then I am fine with it, but I am thinking that it is duplicating the same data in 3 nodes. Is there a way to verify this conclusion?

magnusbaeck · November 1, 2015, 2:46pm

We plan on scaling up to a bigger server, not to more servers (scaling out). We will only have one server in deployment. We were trying to speed up queries that use aggregations, because the indices have become too large and query performance is slow. We thought that by scaling out the number of Elasticsearch instances we could spread the load across multiple instances, which would improve the performance of the aggregations.

That's correct. Elasticsearch generally scales better horizontally than vertically.

Basically, I am seeing the same number of indices with the same document counts, returning the same results, in each of the nodes when I query it. I am not sure whether this is correct. If it is, then I am fine with it, but I am thinking that it is duplicating the same data in 3 nodes. Is there a way to verify this conclusion?

Cluster state is global. Any node can be queried regardless of where the data is actually stored. The behavior you see is normal. Don't worry about it.
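For anyone who wants to verify where the data physically lives, the `_cat/shards` API lists every shard copy and the node holding it. Below is a minimal Python sketch that parses its plain-text output, assuming the request `GET /_cat/shards?h=index,shard,prirep,state,node`; the sample text is made up for illustration and is not output from the poster's cluster. With the 1.x default of one replica, each shard should appear on at most two nodes, not on all three.

```python
from collections import defaultdict

# Hypothetical output of: GET /_cat/shards?h=index,shard,prirep,state,node
SAMPLE = """\
estimate-2015-10-31 0 p STARTED es1
estimate-2015-10-31 0 r STARTED es2
estimate-2015-10-31 1 p STARTED es3
estimate-2015-10-31 1 r STARTED es1
estimate-2015-10-31 2 p STARTED es2
estimate-2015-10-31 2 r UNASSIGNED
"""


def copies_per_shard(cat_text):
    """Map (index, shard number) to the sorted list of nodes holding a copy."""
    copies = defaultdict(list)
    for line in cat_text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # blank line, or an unassigned shard with no node column
        index, shard, _prirep, _state, node = fields[:5]
        copies[(index, int(shard))].append(node)
    return {key: sorted(nodes) for key, nodes in copies.items()}


if __name__ == "__main__":
    for (index, shard), nodes in sorted(copies_per_shard(SAMPLE).items()):
        print(index, shard, nodes)
```

The parser splits on whitespace, so it assumes node names without spaces, such as `es1`, `es2`, and `es3` from the settings posted earlier.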
