Elasticsearch performance tuning

Hi,

I have a single-node Elasticsearch server and I am facing performance issues.
Below is my cluster health:

{
"cluster_name": "es-01",
"status": "red",
"timed_out": false,
"number_of_nodes": 1,
"number_of_data_nodes": 1,
"active_primary_shards": 421,
"active_shards": 421,
"relocating_shards": 0,
"initializing_shards": 0,
"unassigned_shards": 621,
"delayed_unassigned_shards": 0,
"number_of_pending_tasks": 0,
"number_of_in_flight_fetch": 0,
"task_max_waiting_in_queue_millis": 0,
"active_shards_percent_as_number": 40.40307101727447
}

Following are the errors I am getting in the Logstash logs:
> [2017-07-07T21:03:44,716][INFO ][logstash.outputs.elasticsearch] retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[filebeat-2017.05.14][2] primary shard is not active Timeout: [1m], request: [BulkShardRequest to [filebeat-2017.05.14] containing [1] requests]"})
> [2017-07-07T21:03:44,716][ERROR][logstash.outputs.elasticsearch] Retrying individual actions
> [2017-07-07T21:03:44,716][ERROR][logstash.outputs.elasticsearch] Action

At the moment I am not in a position to add more nodes. Please suggest whether I can reconfigure elasticsearch.yml to improve performance.

Can you send the output of

http://localhost:9200/_cat/shards?v
http://localhost:9200/_cat/indices?v

--
Niraj

Hi Neeraj,

The list is too long, so I am pasting a few entries:

[root@ET-PRD-WEB-LOGS elasticsearch]# curl -XGET http://localhost:9200/_cat/shards?v
index shard prirep state docs store ip node
filebeat-2017.06.03 2 p UNASSIGNED
filebeat-2017.06.03 2 r UNASSIGNED
filebeat-2017.06.03 3 p UNASSIGNED
filebeat-2017.06.03 3 r UNASSIGNED
filebeat-2017.06.03 4 p UNASSIGNED
filebeat-2017.06.03 4 r UNASSIGNED
filebeat-2017.06.03 1 p UNASSIGNED
filebeat-2017.06.03 1 r UNASSIGNED
filebeat-2017.06.03 0 p UNASSIGNED
filebeat-2017.06.03 0 r UNASSIGNED
filebeat-2017.06.01 2 p UNASSIGNED
filebeat-2017.06.01 2 r UNASSIGNED
filebeat-2017.06.01 3 p UNASSIGNED
filebeat-2017.06.01 3 r UNASSIGNED
filebeat-2017.06.01 4 p UNASSIGNED
filebeat-2017.06.01 4 r UNASSIGNED
filebeat-2017.06.01 1 p UNASSIGNED
filebeat-2017.06.01 1 r UNASSIGNED
filebeat-2017.06.01 0 p UNASSIGNED
filebeat-2017.06.01 0 r UNASSIGNED
filebeat-2017.06.24 3 p STARTED 747648 795.1mb 10.1.13.8 elasticsearch-01
filebeat-2017.06.24 3 r UNASSIGNED
filebeat-2017.06.24 2 p STARTED 737287 787.4mb 10.1.13.8 elasticsearch-01
filebeat-2017.06.24 2 r UNASSIGNED
filebeat-2017.06.24 4 p STARTED 754842 804.6mb 10.1.13.8 elasticsearch-01

[root@ET-PRD-WEB-LOGS elasticsearch]# curl -XGET http://localhost:9200/_cat/indices?v
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open filebeat-2017.04.30 gKisWjgvT72xyM4bEX5iCA 5 1 387 0 659.6kb 659.6kb

red open filebeat-2017.04.08 xnkhHJvlTlKi4QKh1Tqdew 5 1
yellow open filebeat-2017.06.06 x6SSV9pzT3CKPHt6R4LwLA 5 1 4549189 0 3.6gb 3.6gb

No worries. The issue here is that the shards are in an UNASSIGNED state and therefore have to be rerouted manually. Use the reroute API to move the shards so that they end up in a STARTED state.

You can use something like the Python script below to assign them manually.

import requests
import json

HOSTNAME = "your.elasticsearch.host.com"  # hostname
PORT = 9200                               # port number
NODE_NAME = "node001"                     # node to reroute to


def reroute(index, shard):
    # Force-allocate the given shard onto NODE_NAME via the cluster reroute API.
    payload = {"commands": [{"allocate": {"index": index, "shard": shard, "node": NODE_NAME, "allow_primary": 1}}]}
    res = requests.post(
        "http://%s:%d/_cluster/reroute" % (HOSTNAME, PORT),
        data=json.dumps(payload),
        headers={"Content-Type": "application/json"},
    )
    print(res.text)


# Run a synced flush, then reroute every shard that reports a failure.
res = requests.post("http://%s:%d/_flush/synced" % (HOSTNAME, PORT))
j = res.json()

for field in j:
    if field != "_shards" and j[field]["failed"] != 0:
        for item in j[field]["failures"]:
            reroute(field, item["shard"])

Hi Neeraj,

Thanks for your reply.
I am not familiar with this script. Is this to migrate the shards to another node?

I have only one node at the moment and my cluster health is red. Can you please suggest how I can make it green so things work again?

Thanks

As you have just one node, you can only route shards to that node itself. If you had, say, three nodes, you could route them to any of the three. Rerouting forces a shard out of the unassigned state so it becomes available again. I have never tried it on a single-node cluster, but you can give it a try. If you are not sure what you are doing, try it for one shard first and see the result.

curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
        "commands" : [ {
              "allocate" : {
                  "index" : "name of the index", 
                  "shard" : 4, 
                  "node" : "name of the node", 
                  "allow_primary" : true
              }
            }
        ]
    }'

The shard number can be found in the output of the shards API above.

filebeat-2017.06.03 2 p UNASSIGNED --> So here 2 is the shard number.

Hi Neeraj,

I ran this and got an error:

curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
"commands" : [ {
"allocate" : {
"index" : "filebeat-2017.06.03",
"shard" : 2,
"node" : "elasticsearch-01",
"allow_primary" : true
}
}
]
}'
{"error":{"root_cause":[{"type":"parsing_exception","reason":"no [allocation_command] registered for [allocate]","line":3,"col":28}],"type":"parsing_exception","reason":"[cluster_reroute] failed to parse field [commands]","line":3,"col":28,"caused_by":{"type":"parsing_exception","reason":"no [allocation_command] registered for [allocate]","line":3,"col":28}},"status":400}

Hi Neeraj,

I have tried the below in the Elasticsearch console and it looks like it works:

POST _cluster/reroute
'{
        "commands" : [ {
              "allocate" : {
                  "index" : "filebeat-2017.06.03", 
                  "shard" : 2, 
                  "node" : "elasticsearch-01", 
                  "allow_primary" : true
              }
            }
        ]
    }'

After this I restarted my Elasticsearch service and my cluster is still red:

{
"cluster_name": "es-01",
"status": "red",
"timed_out": false,
"number_of_nodes": 1,
"number_of_data_nodes": 1,
"active_primary_shards": 421,
"active_shards": 421,
"relocating_shards": 0,
"initializing_shards": 0,
"unassigned_shards": 101,
"delayed_unassigned_shards": 0,
"number_of_pending_tasks": 0,
"number_of_in_flight_fetch": 0,
"task_max_waiting_in_queue_millis": 0,
"active_shards_percent_as_number": 80.65134099616859
}

So now you have 101 unassigned shards. Repeat the first command and try rerouting all the shards that are in an unassigned state, and watch for the unassigned shard count in cluster health to drop. Once it is zero your cluster should be green.

"unassigned_shards": 101,

Also, if you look back at the shards API output, try looking for the status of filebeat-2017.06.03 and see what it shows.
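
If rerouting them one by one gets tedious, something along these lines might work to loop over everything that is still unassigned. Treat it as a rough sketch only: on 5.x the plain allocate command from above was split into allocate_stale_primary and allocate_empty_primary, and allocate_empty_primary creates an empty shard and discards whatever data that shard held, so only use it once you have accepted that the data is gone.

# Sketch: force-allocate every UNASSIGNED primary onto elasticsearch-01.
# Replica rows ("r") are skipped, since a replica can never live on the
# same node as its primary and will stay unassigned on a one-node cluster.
curl -s 'localhost:9200/_cat/shards' | awk '$3 == "p" && $4 == "UNASSIGNED" {print $1, $2}' |
while read index shard; do
  curl -XPOST 'localhost:9200/_cluster/reroute' -H 'Content-Type: application/json' \
    -d "{\"commands\":[{\"allocate_empty_primary\":{\"index\":\"$index\",\"shard\":$shard,\"node\":\"elasticsearch-01\",\"accept_data_loss\":true}}]}"
done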

--
Niraj

Hi Neeraj,

After executing this command for a couple of unassigned shards, I am still getting 101 unassigned shards:

{
  "cluster_name": "es-01",
  "status": "red",
  "timed_out": false,
  "number_of_nodes": 1,
  "number_of_data_nodes": 1,
  "active_primary_shards": 421,
  "active_shards": 421,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 101,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 80.65134099616859
}

Did you check the shards API for the indices you rerouted? Is there anything in the logs after you run the reroute command? Try tailing the log for live output as you run the reroute API.
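
On a typical package install the Elasticsearch log lives under /var/log/elasticsearch and is named after the cluster, so for your cluster it should be something like the below (adjust the path if you installed differently):

tail -f /var/log/elasticsearch/es-01.log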

--
Niraj

What is the status of this shard when you run the shards API? Can you post it here?

Hi Neeraj,

When I ran that reroute command it showed some output like:

POST _cluster/reroute

{
"acknowledged": true,
"state": {
"version": 122,
"state_uuid": "3bS69lnbQCiz1B_kxgF7-Q",
"master_node": "Gkujuaz_SHWMdctDwd9i3g",
"blocks": {},
"nodes": {
"Gkujuaz_SHWMdctDwd9i3g": {
"name": "elasticsearch-01",
"ephemeral_id": "EUkJqpqjTKaJxmioVVfW5w",
"transport_address": "10.1.13.8:9300",
"attributes": {}
}
},
"routing_table": {
"indices": {
"filebeat-2017.06.13": {
"shards": {
"0": [
{
"state": "STARTED",
"primary": true,
"node": "Gkujuaz_SHWMdctDwd9i3g",
"relocating_node": null,
"shard": 0,
"index": "filebeat-2017.06.13",
"allocation_id": {
"id": "Q387CDgGSHy6HwVxKWIYbQ"
}
}
],

It goes on, and at the end this is what I can see:
Failed to connect to Console's backend.
Please check the Kibana server is up and running

Cluster health is still "Red".

Hi Neeraj,

I have deleted the old indices whose shards were unassigned. After that my cluster turned "Yellow":

{
"cluster_name": "es-01",
"status": "yellow",
"timed_out": false,
"number_of_nodes": 1,
"number_of_data_nodes": 1,
"active_primary_shards": 296,
"active_shards": 296,
"relocating_shards": 0,
"initializing_shards": 0,
"unassigned_shards": 106,
"delayed_unassigned_shards": 0,
"number_of_pending_tasks": 0,
"number_of_in_flight_fetch": 0,
"task_max_waiting_in_queue_millis": 0,
"active_shards_percent_as_number": 73.6318407960199
}

I have some replica shards which show up like this:

filebeat-2017.05.25 0 r UNASSIGNED
filebeat-2017.05.15 2 r UNASSIGNED
filebeat-2017.05.15 4 r UNASSIGNED
filebeat-2017.05.15 1 r UNASSIGNED
filebeat-2017.05.15 3 r UNASSIGNED
filebeat-2017.05.15 0 r UNASSIGNED
filebeat-2017.05.19 2 r UNASSIGNED
filebeat-2017.05.19 4 r UNASSIGNED
filebeat-2017.05.19 1 r UNASSIGNED
filebeat-2017.05.19 3 r UNASSIGNED
filebeat-2017.05.19 0 r UNASSIGNED
filebeat-2017.05.13 2 r UNASSIGNED
filebeat-2017.05.13 4 r UNASSIGNED
filebeat-2017.05.13 1 r UNASSIGNED

I am unable to delete them. Kindly suggest how I can remove them and turn my cluster state to green.

It looks like you have indices with a replica configured. As you only have one node, Elasticsearch will never assign these. You can however resolve this by updating the replica count for these indices to 0.
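
For example, you could drop the replicas for all of the filebeat indices in one call using an index wildcard (a sketch; adjust the pattern if your indices are named differently):

curl -XPUT 'localhost:9200/filebeat-*/_settings' -H 'Content-Type: application/json' -d '{
  "index": {
    "number_of_replicas": 0
  }
}'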

You have too many shards; reducing that number will help performance.

Hi Mark,

How can I reduce my shards?
Thanks for the help.

Look at the _shrink API.

Then edit your templates and reduce the default shard counts in them, and/or use weekly/monthly indices.
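
As a rough sketch of the shrink flow for one of your daily indices (the index and the -shrunk target name are only examples; the target shard count must be a factor of the source, so 5 shards can shrink to 1):

# 1. Block writes so the index can be shrunk. On a single node every shard
#    copy is already in one place, so no relocation step is needed.
curl -XPUT 'localhost:9200/filebeat-2017.06.24/_settings' -H 'Content-Type: application/json' -d '{
  "index.blocks.write": true
}'

# 2. Shrink the 5-shard index into a new 1-shard index.
curl -XPOST 'localhost:9200/filebeat-2017.06.24/_shrink/filebeat-2017.06.24-shrunk' -H 'Content-Type: application/json' -d '{
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 0
  }
}'

# 3. Once the new index is green, the original can be deleted.
curl -XDELETE 'localhost:9200/filebeat-2017.06.24'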

Set the replica count to 0.

PUT /twitter/_settings
{
  "index" : {
    "number_of_replicas" : 0
  }
}

See the index template documentation for how to modify the template and reduce the default shard count for new indices.

But for existing indices you have to use the shrink API.
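
As an example of the template change (a sketch that assumes your template is called filebeat and you are on 5.x, where the match field is "template"; on 6.x+ it is called "index_patterns"). Note that PUT replaces the whole template, so fetch the existing one first and re-submit it with only the shard settings changed:

# Fetch the current template so its mappings are not lost.
curl -XGET 'localhost:9200/_template/filebeat?pretty'

# Re-submit it with fewer shards for future daily indices (sketch only;
# keep the mappings and other settings from the GET above in the body).
curl -XPUT 'localhost:9200/_template/filebeat' -H 'Content-Type: application/json' -d '{
  "template": "filebeat-*",
  "settings": {
    "index.number_of_shards": 1,
    "index.number_of_replicas": 0
  }
}'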
