I got a 503 response when using the _rollover API. Here is the response:
{
  "error": {
    "root_cause": [
      {
        "type": "process_cluster_event_timeout_exception",
        "reason": "failed to process cluster event (create-index [index-000002], cause [rollover_index]) within 30s"
      }
    ],
    "type": "process_cluster_event_timeout_exception",
    "reason": "failed to process cluster event (create-index [index-000002], cause [rollover_index]) within 30s"
  },
  "status": 503
}
However, when I checked Kibana, the index had actually been created at that time, but the alias was still pointing at the old index.
I thought _rollover was an atomic operation: first create the new index, then switch the alias. But based on this response, it doesn't seem so.
Elasticsearch version: 6.7
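For now I worked around it by hand, roughly like this (the alias name is a placeholder for ours; the index names are the ones from the error, and I'm assuming the only missing piece was the alias switch):

# The new index exists, but the write alias still points at the old one
GET _alias/logs-write
GET _cat/indices/index-00000*?v

# Finish the half-done rollover by swapping the alias manually
POST _aliases
{
  "actions": [
    { "remove": { "index": "index-000001", "alias": "logs-write" } },
    { "add": { "index": "index-000002", "alias": "logs-write" } }
  ]
}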
These kinds of changes require the cluster state to be updated, and that should not time out like this; it indicates that something is wrong in your cluster or that it is under heavy load.
How large is your cluster? How many indices and shards do you have in it? What kind of hardware is the cluster deployed on? What kind of storage is used?
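You can get a quick overview with the cat and cluster APIs, e.g.:

# Nodes with heap usage and roles
GET _cat/nodes?v&h=name,heap.percent,ram.percent,node.role

# Overall index/shard counts and cluster status
GET _cluster/health

# Queued cluster state update tasks; a long queue here usually
# explains a process_cluster_event_timeout_exception
GET _cat/pending_tasks?v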
It's a three-node cluster on AWS, with 6126 indices and 15391 shards. One node was down and the cluster went red for about 15 minutes at that time. I think that's the reason. Thank you for the help.
You have far, far too many shards for a cluster that size, which will have a negative impact on performance and stability. Please read this blog post for guidance and try to reduce the shard count dramatically.
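If your per-user indices are small, most of the reduction usually comes from creating them with a single primary shard instead of the 6.x default of five. As a rough sketch (the template name and index pattern are just examples for your setup):

PUT _template/single-shard-user-indices
{
  "index_patterns": ["user-*"],
  "order": 1,
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}

# Existing indices can be shrunk to one shard as well; the source index
# must be made read-only and have a copy of every shard on one node first
POST /user-42-2019.03/_shrink/user-42-2019.03-shrunk
{
  "settings": { "index.number_of_shards": 1 }
}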
I reduced it to 3k indices and 9k shards. I know that is still too many compared to the "20 per GB of heap" rule in the blog post, but my business logic requires each user to have one index per month, because different user levels have different allowed storage limits. This may not be the best way to use Elasticsearch, but do you have any suggestions for this case?
Why couldn't you combine all users into one index and use the user ID to differentiate them?
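Roughly like this, with user_id as a keyword field (the index and field names are made up):

# One shared monthly index; every document carries the owner's ID
POST /logs-2019.03/_doc
{
  "user_id": "42",
  "message": "example event"
}

# Per-user reads are then just a filter on that field
GET /logs-2019.03/_search
{
  "query": { "term": { "user_id": "42" } }
}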
That's how our business is using it, and we also have monthly rollover indices.
I want to have control over the storage space used by each user, and each user has a different limit. When a user's size exceeds their limit, I can delete their indices. This also prevents log flooding.
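For what it's worth, this is roughly how I enforce the limit today (the index naming scheme is just an example of ours):

# Per-user monthly indices, sorted by on-disk size
GET _cat/indices/user-42-*?v&h=index,docs.count,store.size&bytes=b&s=store.size:desc

# When a user is over their limit, drop their oldest month
DELETE /user-42-2019.01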