Too many active primary shards? Health status yellow

Hello all, I am quite new to Elasticsearch and am experiencing some issues.

I am using the AWS ES service and this is its health:

Status Yellow
Number of nodes 3
Number of data nodes 3
Active primary shards 26
Active shards 47
Relocating shards 0
Initializing shards 0
Unassigned shards 5

Is this number of shards reasonable for the small amount of data that I currently have?

The health status is yellow because you only have one node, so Elasticsearch is not able to allocate the replica shards. A single shard can hold a lot of data, so you should be able to reduce the number of shards if you have little data.
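For example, a small index can be created with a single primary shard and one replica; the index name below is just a placeholder, so adjust it (and prefix the path with your AWS endpoint) as needed:

PUT /my_small_index
{
    "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 1
    }
}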

Hey Theo,

With only one node, your cluster will stay in status "yellow" because your primary shards are allocated but there isn't another node to hold any replicas. Firing up another node should help.

You can check out this blog post concerning shards. There are no cookie-cutter settings; you have to figure out what works for your system. At the end of the blog there is a link to the Definitive Guide, which is a great read if you haven't read it already.

Thanks for coming back to me, @Christian_Dahlqvist and @tdasch.

Status Yellow
Number of nodes 2
Number of data nodes 2
Active primary shards 26
Active shards 47
Relocating shards 0
Initializing shards 0
Unassigned shards 5

@tdasch, fired up a new node but no luck unfortunately. See above.

Also, this is what my shards look like:

.kibana      0 p STARTED      2  11.6kb x.x.x.x  lehWQZM
.kibana      0 r STARTED      2  11.6kb x.x.x.x OcLjQL3
customtitles 2 p STARTED      9 124.8kb x.x.x.x  lehWQZM
customtitles 2 r STARTED      9 124.8kb x.x.x.x OcLjQL3
customtitles 3 r STARTED      2    35kb x.x.x.x  lehWQZM
customtitles 3 p STARTED      2    35kb x.x.x.x OcLjQL3
customtitles 1 r STARTED      3  52.3kb x.x.x.x  lehWQZM
customtitles 1 p STARTED      3  52.3kb x.x.x.x OcLjQL3
customtitles 4 p STARTED      7 106.1kb x.x.x.x  lehWQZM
customtitles 4 r STARTED      7 106.1kb x.x.x.x OcLjQL3
customtitles 0 p STARTED      3  52.3kb x.x.x.x  lehWQZM
customtitles 0 r STARTED      3  52.3kb x.x.x.x OcLjQL3
records      2 p STARTED    120 188.2kb x.x.x.x  lehWQZM
records      2 r STARTED    120 188.2kb x.x.x.x OcLjQL3
records      2 r UNASSIGNED                           
records      3 p STARTED    103  91.9kb x.x.x.x  lehWQZM
records      3 r STARTED    103  91.9kb x.x.x.x OcLjQL3
records      3 r UNASSIGNED                           
records      1 p STARTED    129 181.1kb x.x.x.x  lehWQZM
records      1 r STARTED    129 181.1kb x.x.x.x OcLjQL3
records      1 r UNASSIGNED                           
records      4 p STARTED    117 151.8kb x.x.x.x  lehWQZM
records      4 r STARTED    117 151.8kb x.x.x.x OcLjQL3
records      4 r UNASSIGNED                           
records      0 p STARTED    118 174.9kb x.x.x.x  lehWQZM
records      0 r STARTED    118 174.9kb x.x.x.x OcLjQL3
records      0 r UNASSIGNED                           
plastic      2 p STARTED     15  33.6kb x.x.x.x  lehWQZM
plastic      3 p STARTED     24  50.2kb x.x.x.x  lehWQZM
plastic      1 p STARTED     16  38.5kb x.x.x.x OcLjQL3
plastic      4 p STARTED     25  58.9kb x.x.x.x OcLjQL3
plastic      0 p STARTED     14  25.1kb x.x.x.x OcLjQL3
recordtypes  2 r STARTED      0    230b x.x.x.x  lehWQZM
recordtypes  2 p STARTED      0    230b x.x.x.x OcLjQL3
recordtypes  3 p STARTED      0    230b x.x.x.x  lehWQZM
recordtypes  3 r STARTED      0    230b x.x.x.x OcLjQL3
recordtypes  1 p STARTED      0    230b x.x.x.x  lehWQZM
recordtypes  1 r STARTED      0    230b x.x.x.x OcLjQL3
recordtypes  4 r STARTED      0    230b x.x.x.x  lehWQZM
recordtypes  4 p STARTED      0    230b x.x.x.x OcLjQL3
recordtypes  0 r STARTED      0    230b x.x.x.x  lehWQZM
recordtypes  0 p STARTED      0    230b x.x.x.x OcLjQL3
contributors 2 r STARTED      3  12.2kb x.x.x.x  lehWQZM
contributors 2 p STARTED      3  12.2kb x.x.x.x OcLjQL3
contributors 3 p STARTED      5  20.2kb x.x.x.x  lehWQZM
contributors 3 r STARTED      5  20.2kb x.x.x.x OcLjQL3
contributors 1 p STARTED      6  20.3kb x.x.x.x  lehWQZM
contributors 1 r STARTED      6  20.3kb x.x.x.x OcLjQL3
contributors 4 r STARTED      6  20.3kb x.x.x.x  lehWQZM
contributors 4 p STARTED      6  20.3kb x.x.x.x OcLjQL3
contributors 0 r STARTED      4  12.3kb x.x.x.x  lehWQZM
contributors 0 p STARTED      4  12.3kb x.x.x.x OcLjQL3

Update
It's getting worse now, as the response times are 4-6 seconds!

Could you provide your settings for number_of_shards and number_of_replicas, please? Could we also try using the allocation explain API to get the reason for the unassigned shards? Here is a link to the explain API documentation.
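A request along these lines should show the reason; pick one of the unassigned copies from your shard listing (e.g. a records replica) and prefix the path with your AWS endpoint:

GET _cluster/allocation/explain
{
    "index": "records",
    "shard": 2,
    "primary": false
}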

Sure! So, when I curl GET https://search-es-idhere.eu-west-2.es.amazonaws.com/myindex/_settings, this is what I get:

{
    "myindex": {
        "settings": {
            "index": {
                "creation_date": "1528217194254",
                "number_of_shards": "5",
                "number_of_replicas": "0",
                "uuid": "wefwefL-W-s3",
                "version": {
                    "created": "6020299"
                },
                "provided_name": "myindex"
            }
        }
    }
}

Hope that helps?

BTW, did you look at https://www.elastic.co/cloud and https://aws.amazon.com/marketplace/pp/B01N6YCISK?

Cloud by Elastic is the only way to have access to X-Pack. Think about what is already there, like Security, Monitoring and Reporting, and what is coming, like Canvas and SQL...

Thanks, @dadoonet! No, I didn't have a look into it.
I would have thought that AWS ES would be able to help me manage my ES nodes and make it easier for a beginner with ES, like me.

Here is the response from the explain API:

{
        "index": "records",
        "shard": 2,
        "primary": false,
        "current_state": "unassigned",
        "unassigned_info": {
            "reason": "NODE_LEFT",
            "at": "2018-06-13T09:52:24.369Z",
            "details": "node_left[G9tyyCV-SDSCHBqJSlMLsA]",
            "last_allocation_status": "no_attempt"
        },
        "can_allocate": "no",
        "allocate_explanation": "cannot allocate because allocation is not permitted to any of the nodes",
        "node_allocation_decisions": [
            {
                "node_id": "OcLjQL3ZTwKwjhAeGGJGDA",
                "node_name": "OcLjQL3",
                "node_decision": "no",
                "deciders": [
                    {
                        "decider": "same_shard",
                        "decision": "NO",
                        "explanation": "the shard cannot be allocated to the same node on which a copy of the shard already exists [[records][2], node[OcLjQL3ZTwKwjhAeGGJGDA], [R], s[STARTED], a[id=KGE7aXFWQrumdowO1WKF9Q]]"
                    }
                ]
            },
            {
                "node_id": "lehWQZMES8mWn1dcFN8Iew",
                "node_name": "lehWQZM",
                "node_decision": "no",
                "deciders": [
                    {
                        "decider": "same_shard",
                        "decision": "NO",
                        "explanation": "the shard cannot be allocated to the same node on which a copy of the shard already exists [[records][2], node[lehWQZMES8mWn1dcFN8Iew], [P], s[STARTED], a[id=aXZX9gIPS1qYK6XROdKYzA]]"
                    }
                ]
            }
        ]
    }

You seem to have 2 replicas configured for this index, which need 3 data nodes in order to be fully assigned.
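You can double-check by looking at the settings of that specific index rather than myindex, for example (again against your AWS endpoint):

GET records/_settings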


@Christian_Dahlqvist I see, then I should reduce the number of replicas to 1?

That will solve the problem.
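You can change it dynamically on the existing index with the update index settings API, along these lines (adjust the index name as needed):

PUT records/_settings
{
    "index": {
        "number_of_replicas": 1
    }
}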

Indeed! Many thanks, @Christian_Dahlqvist!

While I have the chance: my response times are definitely better now, but they vary between 200ms and 1.56s.
Where should I look to investigate this?

to help me manage my ES nodes and make it easier for a beginner with ES, like me

Indeed. That's exactly what Cloud by Elastic offers. You have 15 days totally free to try it if you go to cloud.elastic.co and register from there.

Looking good! We will definitely consider it. Thanks, @dadoonet.

@Christian_Dahlqvist, the problem was not enough nodes. I wanted to make sure I understood how to determine that correctly. You would use the equation N >= R + 1, correct? Where N is the number of nodes in the cluster and R is the largest replication factor across all indices in the cluster.

So in Theo's case, 2 >= 2 + 1 does not hold, requiring 3 nodes for proper allocation of replicas, correct?

Yes, that sounds correct. As 2 >= 2 + 1 does not hold, you need at least 3 data nodes.

It is a reasonably common misunderstanding. It is not always clear that an index with 3 primary shards and 1 replica will result in a total of 6 shards.
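As a rule of thumb: total shard copies = number_of_shards * (1 + number_of_replicas). The records index above works out to 5 * (1 + 2) = 15 copies, of which only 10 could be placed on 2 nodes, hence the 5 unassigned replicas.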


Thank you!

@Christian_Dahlqvist / @tdasch
Does this mean that I should fire up one more node as I currently have 2?

Do I need this number of nodes given my small amount of data (probably not even 1000 documents altogether)?

In order to have high availability, you should always aim to have at least 3 master-eligible nodes in the cluster.
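On a self-managed 6.x cluster you would then also set the master quorum to (number of master-eligible nodes / 2) + 1, rounded down, i.e. 2 for 3 master-eligible nodes. AWS ES manages this for you, but for reference it can be set like this on a cluster you run yourself:

PUT _cluster/settings
{
    "persistent": {
        "discovery.zen.minimum_master_nodes": 2
    }
}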