Shard limit hit

jmgarcia · March 12, 2018, 5:45pm

Well, to get down to it, we migrated servers this weekend. My script worked before but now it doesn't. I keep getting this error:

"[{'error': {'root_cause': [{'type': 'illegal_argument_exception', 'reason': 'Trying to query 1536 shards, which is over the limit of 1000. This limit exists because querying many shards at the same time can make the job of the coordinating node very CPU and/or memory intensive. It is usually a better idea to have a smaller number of larger shards. Update [action.search.shard_count.limit] to a greater value if you really want to query that many shards at the same time.'}]"

I have been trying all day to get either action.search.shard_count.limit or max_concurrent_shard_requests working and can't quite figure it out. Here is a snippet of my code...

{
    "size": 0,
    "query": {
        "query_string": {
            "query": fleCont
        }
    },
    "aggs": {
        "per_scott": {
            "terms": {
                "script": {
                    "lang": "painless",
                    "inline": "doc['src_ip'].value + ',' + doc['dst_ip'].value + ',' + doc['dst_port'].value + ',' + doc['proto'].value + ',' + doc['devicename'].value + ',' + doc['policy_id'].value"
                },
                "size": 10000,
            }
        }
    }
}

pixelrebel · March 12, 2018, 5:56pm

Funny, I just hit this limit last week. It turned out I was querying way more indices than I needed. My query used a wildcard myindex-*. With this, I was querying months of data when I only needed the last thirty days. I changed my logic to account for the date. So now I'm searching myindex-2018.03.*,myindex-2018.02.*, and lightening the load on my es stack. This may not apply to your case, but I thought I'd share my anecdote just in case.

dadoonet · March 12, 2018, 6:00pm

That's definitely a good solution.

jmgarcia · March 12, 2018, 6:00pm

Hey, thanks pixel, but yeah, I need to search all of logstash. It is just strange that my admin would have reduced the allowed shard count.

jmgarcia · March 12, 2018, 6:02pm

I agree it is but we have 200+ logstash logs to dig through. I think it would be more detrimental to the server to feed each logstash in, connect, search, disconnect etc than just upping the shard counts.

pixelrebel · March 12, 2018, 6:22pm

1000 is the default. Your admin either neglected to migrate this custom setting over, or more likely, your admin may have increased the number of shards per index on your new cluster.

Ideally, you will want to keep within the recommended settings and rewrite logic around the limitations. That said, I think I found a post with the answer you are looking for. Changing the limit is a cluster setting that will need to be configured by an administrator:
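The linked post is not reproduced here. As a rough sketch of what an administrator's change might look like, the snippet below only builds the request body for the cluster update settings API (PUT /_cluster/settings). It assumes action.search.shard_count.limit (the setting named in the error message) can be updated dynamically on your Elasticsearch version; the function name and the value 2000 are made up for illustration.

```python
import json


def shard_limit_settings_body(limit, persistent=False):
    """Build a body for PUT /_cluster/settings that raises the search shard limit.

    Assumption: the setting is dynamically updatable on the cluster's version.
    """
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    # "transient" settings are lost on a full cluster restart,
    # "persistent" settings survive it.
    scope = "persistent" if persistent else "transient"
    return {scope: {"action.search.shard_count.limit": limit}}


# Example: the JSON an admin could send to PUT /_cluster/settings.
body = shard_limit_settings_body(2000)
print(json.dumps(body))
```

Nothing is sent to the cluster here; the actual request needs admin access and is best left to whoever manages the cluster.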

jmgarcia · March 12, 2018, 7:19pm

I am looking into how to rewrite my logic but I am lost. Do you have any suggestions?

dadoonet · March 12, 2018, 7:29pm

First thing: why are you using so many shards?

May I suggest you look at the following resources about sizing:

https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing

jmgarcia · March 12, 2018, 7:39pm

I do not manage this server. I was told to "search Elasticsearch for X and get the results". So that is what I am doing. I don't mean to sound mean or apathetic, but 1) this just started happening today (on Friday it wasn't happening, and yes, we migrated over the weekend) and 2) I am at a disconnect as far as being an admin on an Elasticsearch database goes, because I was literally thrown into this project after I said I don't know it.

I will email the admin and see what they say but I feel like I need a little bit better ground to stand on first. I will watch the video and read some more but still, I am not sure I will have enough.

pixelrebel · March 12, 2018, 7:48pm

@dadoonet unfortunately I don't think OP is the admin, so tweaking his cluster may not be an option for him.

@jmgarcia how many indices are you hitting when you make your query? Are your indices time-based, as in daily or monthly (e.g. myindex-2018.03)? Are you using wildcards in your index request (e.g. myindex-*)? If so, the workaround for this case would be something like this:

Establish the date range you need to query (i.e. last 24 months)
Divide this range into months (2018.03, 2018.02, 2018.01, 2017.12, etc)
Then join these months into acceptable chunks that fly under your limit:
query1: https://myeshost:9200/myindex-2018.01*,myindex-2017.12*
query2: https://myeshost:9200/myindex-2018.03*,myindex-2018.02*
....etc....
Finally, you would populate a master dict where you manually aggregate the multiple queries into a single search result.
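The steps above could be sketched in Python roughly as follows. This is only an illustration under some assumptions: the index prefix myindex-, the chunk size of two months per query, and all three helper names are hypothetical, and the merge step assumes each response is the parsed JSON of a search that contains a terms aggregation (such as per_scott from the original query).

```python
from collections import Counter
from datetime import date


def month_patterns(end, months, prefix="myindex-"):
    """Return patterns like 'myindex-2018.03*' for the last `months` months, newest first."""
    patterns = []
    year, month = end.year, end.month
    for _ in range(months):
        patterns.append(f"{prefix}{year:04d}.{month:02d}*")
        month -= 1
        if month == 0:
            year, month = year - 1, 12
    return patterns


def chunked(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def merge_buckets(responses, agg_name):
    """Sum doc_count per bucket key across several search responses."""
    totals = Counter()
    for response in responses:
        for bucket in response["aggregations"][agg_name]["buckets"]:
            totals[bucket["key"]] += bucket["doc_count"]
    return totals


# Build one comma-separated index list per query.
patterns = month_patterns(date(2018, 3, 12), months=4)
for chunk in chunked(patterns, 2):
    print(",".join(chunk))
```

Each printed line is the index part of one search URL. Run the same query body against each one, collect the parsed responses in a list, and pass that list to merge_buckets(responses, "per_scott") to get the combined counts. Because each terms aggregation returns only its top buckets (10000 in the original query), the merged totals can undercount keys that fell outside the top buckets of some chunk.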

jmgarcia · March 12, 2018, 7:52pm

You are correct @pixelrebel I am not an admin.

When I run the cat command I get a total of 128 different logstash indices that I need to search through. They are moved to a different server every ~30 days and there is no way I will be able to access those servers.

Christian_Dahlqvist · March 12, 2018, 8:57pm

This limit on the number of shards that can be queried was introduced in Elasticsearch 5.x, which may be the version you migrated to over the weekend.

As David stated, it does look like you have a lot of shards. If you can provide the output of the cluster stats API we can get a better idea about the state of the cluster.

jmgarcia · March 13, 2018, 5:17pm

Alright, after more digging and talking with the admin, they changed the requirements so now I am not hitting the shard limit. A couple are close (I think one returns 989) but for now the issue is solved.

Right now the cluster stats would need too much scrubbing, @Christian_Dahlqvist, and I don't have enough time to do that and post the results in here. I will do it tomorrow or the following day, though. I know I will be running into this problem again in the future, so all I need is a little more time to sort other stuff out and then we will be golden.

Thank you to everybody who helped.

Adios.... for now.

Christian_Dahlqvist · March 13, 2018, 5:19pm

The output of the cluster stats (not state) API typically does not contain anything particularly sensitive. What is it that needs to be scrubbed?

jmgarcia · March 14, 2018, 1:40pm

They don't want the server names or anything like it released, including the logstash-specific names. I will do my best to get the results today.
