Is it possible to search on replica node ONLY?

Hello all,
I'm using Grafana for visualization.
The problem is, if I put too many graphs on a single dashboard, it causes ES to freeze - and of course, I can't insert any data into ES while it's frozen.
The only solution I've found for this so far is to clear the ES cache...

To solve this problem, I added 2 more ES data nodes, but had no luck :frowning:
If I could make Grafana search on replica shards only, I think it would help - at the very least, the primary shards would keep working, wouldn't they?
But I can't find how to connect to replicas. Is there any guide for this?

Thanks for any help.

Primary and replica shards do essentially the same amount of work for indexing, so routing searches to replicas would not help even if it were possible. You probably need to tune your cluster instead.

Which version of Elasticsearch are you using? How much data do you have in the cluster? How many indices/shards do you have in the cluster? What is the specification of the nodes Elasticsearch is running on?

I'm using ES 1.7.4 and have 500GB of data in 3 indices.
Each index has 4 primary shards, and each shard has 1 replica.
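For context, the shard layout above works out as follows (a quick sketch using only the numbers from this post):

```python
# Shard totals for the cluster described above:
# 3 indices, each with 4 primary shards and 1 replica per primary.
indices = 3
primaries_per_index = 4
replicas_per_primary = 1

total_primaries = indices * primaries_per_index                # 12
total_shards = total_primaries * (1 + replicas_per_primary)    # 24

print(total_primaries, total_shards)
# Spread across 3 data nodes, that is an average of 8 shards per node.
print(total_shards // 3)
```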

I have 3 nodes in my ES cluster. Here's the spec:

  • node 1 (master + data): 4 cores, 16GB memory, 600GB disk
  • node 2 (data): 2 cores, 8GB memory, 600GB disk
  • node 3 (data): 2 cores, 8GB memory, 600GB disk

and I changed some values in elasticsearch.yml:

```yaml
indices.breaker.fielddata.limit: 50%
index.unassigned.node_left.delayed_timeout: 5m
script.disable_dynamic: true
bootstrap.mlockall: true
indices.recovery.max_bytes_per_sec: 60mb
```

I'm using Fluentd and Graylog. The server receives almost 6GB of data per day.

Thanks for your help, Christian :slight_smile:

For this type of use case we usually recommend time-based indices, which you do not seem to be using. These allow you to hit only the indices that can contain data for the time interval you are querying, so each query touches less data. If you are having problems with heap usage, you may also want to look into doc values if you are not already using them. I am not familiar with Graylog and what limitations it may impose in this respect, though.
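To make the doc values suggestion concrete, here is a minimal sketch of what such a mapping looks like in ES 1.x, where doc values are enabled per field with `"doc_values": true` on `not_analyzed` fields. The index type and field names (`message`, `level`, `@timestamp`) are hypothetical, and Graylog manages its own mappings, so treat this only as an illustration:

```python
import json

# Hypothetical ES 1.x mapping sketch: doc_values stores field data on
# disk instead of the JVM heap, which relieves fielddata pressure.
mapping = {
    "mappings": {
        "message": {  # hypothetical type name
            "properties": {
                "level": {
                    "type": "string",
                    "index": "not_analyzed",   # doc_values needs not_analyzed in 1.x
                    "doc_values": True,
                },
                "@timestamp": {
                    "type": "date",
                    "doc_values": True,
                },
            }
        }
    }
}

print(json.dumps(mapping, indent=2))
```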

As your nodes are not the same size it is possible that the smaller nodes are getting overloaded. Have you looked at CPU and IO load across the nodes to try and identify the bottleneck?

It seems I can't use doc values with Graylog (link).

Time-based indices could be helpful, and it seems Graylog has a similar feature, but I need to look into it more.
What if I split the data across more shards? Could that help?

here's my ES cluster's current situation :


I often see a 'fielddata too long' message when I query data :frowning:

If you are having issues with heap and field data, doc_values is generally the way to go. Time-based indices may help if you can limit the number of indices you are querying. Just increasing the shard count is, I think, unlikely to help much. I would raise this with the Graylog community, as they may have a better idea about what is and is not possible within the constraints of Graylog.
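To make the time-based indices point concrete, here is a rough sketch of how a client could target only the daily indices covering a query window. The `logs-YYYY.MM.DD` naming is an assumption (Logstash-style); Graylog's actual index naming differs:

```python
from datetime import date, timedelta

def indices_for_range(start, end, prefix="logs-"):
    """Daily index names (hypothetical Logstash-style prefix) covering [start, end]."""
    names = []
    day = start
    while day <= end:
        names.append(prefix + day.strftime("%Y.%m.%d"))
        day += timedelta(days=1)
    return names

# A 3-day dashboard window then touches only 3 small daily indices
# instead of one large index holding all 500GB:
names = indices_for_range(date(2016, 1, 5), date(2016, 1, 7))
print(names)
# ['logs-2016.01.05', 'logs-2016.01.06', 'logs-2016.01.07']
```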

Thanks for your help, Christian :slight_smile: It was really helpful!