Elasticsearch performance tuning

Hi,

I have a single-node Elasticsearch server and I am facing performance issues.
Below is my cluster health:

{
"cluster_name": "es-01",
"status": "red",
"timed_out": false,
"number_of_nodes": 1,
"number_of_data_nodes": 1,
"active_primary_shards": 421,
"active_shards": 421,
"relocating_shards": 0,
"initializing_shards": 0,
"unassigned_shards": 621,
"delayed_unassigned_shards": 0,
"number_of_pending_tasks": 0,
"number_of_in_flight_fetch": 0,
"task_max_waiting_in_queue_millis": 0,
"active_shards_percent_as_number": 40.40307101727447
}

Following are the errors I am getting in the Logstash logs:
> [2017-07-07T21:03:44,716][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[filebeat-2017.05.14][2] primary shard is not active Timeout: [1m], request: [BulkShardRequest to [filebeat-2017.05.14] containing [1] requests]"})
> [2017-07-07T21:03:44,716][ERROR][logstash.outputs.elasticsearch] Retrying individual actions
> [2017-07-07T21:03:44,716][ERROR][logstash.outputs.elasticsearch] Action

At the moment I am not in a position to add more nodes. Please suggest whether I can reconfigure elasticsearch.yml to improve performance.

Can you send the output of

http://localhost:9200/_cat/shards?v
http://localhost:9200/_cat/indices?v

--
Niraj

Hi Neeraj,

The list is too long, so I am pasting a few entries:

[root@ET-PRD-WEB-LOGS elasticsearch]# curl -XGET http://localhost:9200/_cat/shards?v
index shard prirep state docs store ip node
filebeat-2017.06.03 2 p UNASSIGNED
filebeat-2017.06.03 2 r UNASSIGNED
filebeat-2017.06.03 3 p UNASSIGNED
filebeat-2017.06.03 3 r UNASSIGNED
filebeat-2017.06.03 4 p UNASSIGNED
filebeat-2017.06.03 4 r UNASSIGNED
filebeat-2017.06.03 1 p UNASSIGNED
filebeat-2017.06.03 1 r UNASSIGNED
filebeat-2017.06.03 0 p UNASSIGNED
filebeat-2017.06.03 0 r UNASSIGNED
filebeat-2017.06.01 2 p UNASSIGNED
filebeat-2017.06.01 2 r UNASSIGNED
filebeat-2017.06.01 3 p UNASSIGNED
filebeat-2017.06.01 3 r UNASSIGNED
filebeat-2017.06.01 4 p UNASSIGNED
filebeat-2017.06.01 4 r UNASSIGNED
filebeat-2017.06.01 1 p UNASSIGNED
filebeat-2017.06.01 1 r UNASSIGNED
filebeat-2017.06.01 0 p UNASSIGNED
filebeat-2017.06.01 0 r UNASSIGNED
filebeat-2017.06.24 3 p STARTED 747648 795.1mb 10.1.13.8 elasticsearch-01
filebeat-2017.06.24 3 r UNASSIGNED
filebeat-2017.06.24 2 p STARTED 737287 787.4mb 10.1.13.8 elasticsearch-01
filebeat-2017.06.24 2 r UNASSIGNED
filebeat-2017.06.24 4 p STARTED 754842 804.6mb 10.1.13.8 elasticsearch-01

[root@ET-PRD-WEB-LOGS elasticsearch]# curl -XGET http://localhost:9200/_cat/indices?v
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open filebeat-2017.04.30 gKisWjgvT72xyM4bEX5iCA 5 1 387 0 659.6kb 659.6kb

red open filebeat-2017.04.08 xnkhHJvlTlKi4QKh1Tqdew 5 1
yellow open filebeat-2017.06.06 x6SSV9pzT3CKPHt6R4LwLA 5 1 4549189 0 3.6gb 3.6gb

No worries. The issue here is that the shards are in an UNASSIGNED state and therefore have to be rerouted manually. Use the reroute API to move the shards so that they end up in a STARTED state.

You can use something like the Python script below to assign them manually.

import requests
import json

HOSTNAME = "your.elasticsearch.host.com"  # hostname
PORT = 9200                               # port number
NODE_NAME = "node001"                     # node to reroute to


def reroute(index, shard):
    # Force-allocate the given shard onto NODE_NAME via the cluster reroute API.
    payload = {"commands": [{"allocate": {"index": index, "shard": shard, "node": NODE_NAME, "allow_primary": 1}}]}
    res = requests.post(
        "http://%s:%d/_cluster/reroute" % (HOSTNAME, PORT),
        data=json.dumps(payload),
        headers={"Content-Type": "application/json"},
    )
    print(res.text)


# Run a synced flush, then reroute every shard that reports a failure.
res = requests.post("http://%s:%d/_flush/synced" % (HOSTNAME, PORT))
j = res.json()

for field in j:
    if field != "_shards" and j[field]["failed"] != 0:
        for item in j[field]["failures"]:
            reroute(field, item["shard"])

Hi Neeraj,

Thanks for your reply.
I am not familiar with this script. Is this to migrate the shards to another node?

I have only one node at the moment and my cluster health is red. Can you please suggest how I can make it green so things work again?

Thanks

As you have just one node, you can only route shards to that node itself. If you had, say, three nodes, you could route them to any of the three. Rerouting forces a shard out of the unassigned state so it becomes available again. I have never tried it on a single-node cluster, but you can give it a try. If you are not sure what you are doing, try it for one shard first and see the result.

curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
        "commands" : [ {
              "allocate" : {
                  "index" : "name of the index", 
                  "shard" : 4, 
                  "node" : "name of the node", 
                  "allow_primary" : true
              }
            }
        ]
    }'

The shard number can be found in the output of the shards API above.

filebeat-2017.06.03 2 p UNASSIGNED --> So here 2 is the shard number.

Hi Neeraj,

I ran this and got an error:

curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
"commands" : [ {
"allocate" : {
"index" : "filebeat-2017.06.03",
"shard" : 2,
"node" : "elasticsearch-01",
"allow_primary" : true
}
}
]
}'
{"error":{"root_cause":[{"type":"parsing_exception","reason":"no [allocation_command] registered for [allocate]","line":3,"col":28}],"type":"parsing_exception","reason":"[cluster_reroute] failed to parse field [commands]","line":3,"col":28,"caused_by":{"type":"parsing_exception","reason":"no [allocation_command] registered for [allocate]","line":3,"col":28}},"status":400}

Hi Neeraj,

I have tried the below in the Elasticsearch console and it looks like it works:

POST _cluster/reroute
'{
        "commands" : [ {
              "allocate" : {
                  "index" : "filebeat-2017.06.03", 
                  "shard" : 2, 
                  "node" : "elasticsearch-01", 
                  "allow_primary" : true
              }
            }
        ]
    }'

After this I restarted my Elasticsearch service and my cluster is still red:

{
"cluster_name": "es-01",
"status": "red",
"timed_out": false,
"number_of_nodes": 1,
"number_of_data_nodes": 1,
"active_primary_shards": 421,
"active_shards": 421,
"relocating_shards": 0,
"initializing_shards": 0,
"unassigned_shards": 101,
"delayed_unassigned_shards": 0,
"number_of_pending_tasks": 0,
"number_of_in_flight_fetch": 0,
"task_max_waiting_in_queue_millis": 0,
"active_shards_percent_as_number": 80.65134099616859
}

So now you have 101 unassigned shards. Repeat the first command and try rerouting all the shards that are in an unassigned state, and watch for the unassigned shard count in cluster health to drop. Once it is zero your cluster should be green.

"unassigned_shards": 101,

Also, if you look back at the shards API output, try looking for the status of filebeat-2017.06.03 and see what it shows.
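
If rerouting them one by one gets tedious, something along these lines might work to loop over everything that is still unassigned. Treat it as a rough sketch only: on 5.x the plain allocate command from above was split into allocate_stale_primary and allocate_empty_primary, and allocate_empty_primary creates an empty shard and discards whatever data that shard held, so only use it once you have accepted that the data is gone.

# Sketch: force-allocate every UNASSIGNED primary onto elasticsearch-01.
# Replica rows ("r") are skipped, since a replica can never live on the
# same node as its primary and will stay unassigned on a one-node cluster.
curl -s 'localhost:9200/_cat/shards' | awk '$3 == "p" && $4 == "UNASSIGNED" {print $1, $2}' |
while read index shard; do
  curl -XPOST 'localhost:9200/_cluster/reroute' -H 'Content-Type: application/json' \
    -d "{\"commands\":[{\"allocate_empty_primary\":{\"index\":\"$index\",\"shard\":$shard,\"node\":\"elasticsearch-01\",\"accept_data_loss\":true}}]}"
done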

--
Niraj

Hi Neeraj,

After executing this command for a couple of unassigned shards, I am still getting 101 unassigned shards:

{
  "cluster_name": "es-01",
  "status": "red",
  "timed_out": false,
  "number_of_nodes": 1,
  "number_of_data_nodes": 1,
  "active_primary_shards": 421,
  "active_shards": 421,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 101,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 80.65134099616859
}

Did you check the shards API for the indices you rerouted? Is there anything in the logs after you run the reroute command? Try tailing the log for live output as you run the reroute API.
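
On a typical package install the Elasticsearch log lives under /var/log/elasticsearch and is named after the cluster, so for your cluster it should be something like the below (adjust the path if you installed differently):

tail -f /var/log/elasticsearch/es-01.log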

--
Niraj

What is the status of this shard when you run the shards API? Can you post it here?

Hi Neeraj,

When I ran that reroute command it showed some output like:

POST _cluster/reroute

{
"acknowledged": true,
"state": {
"version": 122,
"state_uuid": "3bS69lnbQCiz1B_kxgF7-Q",
"master_node": "Gkujuaz_SHWMdctDwd9i3g",
"blocks": {},
"nodes": {
"Gkujuaz_SHWMdctDwd9i3g": {
"name": "elasticsearch-01",
"ephemeral_id": "EUkJqpqjTKaJxmioVVfW5w",
"transport_address": "10.1.13.8:9300",
"attributes": {}
}
},
"routing_table": {
"indices": {
"filebeat-2017.06.13": {
"shards": {
"0": [
{
"state": "STARTED",
"primary": true,
"node": "Gkujuaz_SHWMdctDwd9i3g",
"relocating_node": null,
"shard": 0,
"index": "filebeat-2017.06.13",
"allocation_id": {
"id": "Q387CDgGSHy6HwVxKWIYbQ"
}
}
],

It goes on, and at the end this is what I can see:
Failed to connect to Console's backend.
Please check the Kibana server is up and running

Cluster health is still "Red".

Hi Neeraj,

I have deleted the old indices whose shards were unassigned. After that my cluster turned "Yellow":

{
"cluster_name": "es-01",
"status": "yellow",
"timed_out": false,
"number_of_nodes": 1,
"number_of_data_nodes": 1,
"active_primary_shards": 296,
"active_shards": 296,
"relocating_shards": 0,
"initializing_shards": 0,
"unassigned_shards": 106,
"delayed_unassigned_shards": 0,
"number_of_pending_tasks": 0,
"number_of_in_flight_fetch": 0,
"task_max_waiting_in_queue_millis": 0,
"active_shards_percent_as_number": 73.6318407960199
}

I have some replica shards which show up like this:

filebeat-2017.05.25 0 r UNASSIGNED
filebeat-2017.05.15 2 r UNASSIGNED
filebeat-2017.05.15 4 r UNASSIGNED
filebeat-2017.05.15 1 r UNASSIGNED
filebeat-2017.05.15 3 r UNASSIGNED
filebeat-2017.05.15 0 r UNASSIGNED
filebeat-2017.05.19 2 r UNASSIGNED
filebeat-2017.05.19 4 r UNASSIGNED
filebeat-2017.05.19 1 r UNASSIGNED
filebeat-2017.05.19 3 r UNASSIGNED
filebeat-2017.05.19 0 r UNASSIGNED
filebeat-2017.05.13 2 r UNASSIGNED
filebeat-2017.05.13 4 r UNASSIGNED
filebeat-2017.05.13 1 r UNASSIGNED

I am unable to delete them. Kindly suggest how I can remove them and turn my cluster state to green.

It looks like you have indices with a replica configured. As you only have one node, Elasticsearch will never assign these. You can however resolve this by updating the replica count for these indices to 0.
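
For example, you could drop the replicas for all of the filebeat indices in one call using an index wildcard (a sketch; adjust the pattern if your indices are named differently):

curl -XPUT 'localhost:9200/filebeat-*/_settings' -H 'Content-Type: application/json' -d '{
  "index": {
    "number_of_replicas": 0
  }
}'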

You have too many shards; reducing that number will help performance.

Hi Mark,

How can I reduce my shards?
Thanks for the help.

Look at the _shrink API.

Then edit your templates and reduce the default shard counts in them, and/or use weekly/monthly indices.
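
As a rough sketch of the shrink flow for one of your daily indices (the index and the -shrunk target name are only examples; the target shard count must be a factor of the source, so 5 shards can shrink to 1):

# 1. Block writes so the index can be shrunk. On a single node every shard
#    copy is already in one place, so no relocation step is needed.
curl -XPUT 'localhost:9200/filebeat-2017.06.24/_settings' -H 'Content-Type: application/json' -d '{
  "index.blocks.write": true
}'

# 2. Shrink the 5-shard index into a new 1-shard index.
curl -XPOST 'localhost:9200/filebeat-2017.06.24/_shrink/filebeat-2017.06.24-shrunk' -H 'Content-Type: application/json' -d '{
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 0
  }
}'

# 3. Once the new index is green, the original can be deleted.
curl -XDELETE 'localhost:9200/filebeat-2017.06.24'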

Set the replica count to 0.

PUT /twitter/_settings
{
  "index" : {
    "number_of_replicas" : 0
  }
}

See the index template documentation for how to modify the template and reduce the default shard count for new indices.

But for existing indices you have to use the shrink API.
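
As an example of the template change (a sketch that assumes your template is called filebeat and you are on 5.x, where the match field is "template"; on 6.x+ it is called "index_patterns"). Note that PUT replaces the whole template, so fetch the existing one first and re-submit it with only the shard settings changed:

# Fetch the current template so its mappings are not lost.
curl -XGET 'localhost:9200/_template/filebeat?pretty'

# Re-submit it with fewer shards for future daily indices (sketch only;
# keep the mappings and other settings from the GET above in the body).
curl -XPUT 'localhost:9200/_template/filebeat' -H 'Content-Type: application/json' -d '{
  "template": "filebeat-*",
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 0
  }
}'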
