Elasticsearch has a large number of search rejections

Hi guys

Recently my Elasticsearch has been returning HTTP 429 for all _search requests. After some investigation, I noticed a large number of rejected searches in my thread pools:

```
node_name             name   active rejected completed
tiebreaker-0000000006 search      0        0         1
instance-0000000005   search      4    21351      7085
instance-0000000004   search      4    20447     11544
```
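For reference, a view like this can be reproduced with the cat thread pool API (columns assumed from the output above):

```
GET _cat/thread_pool/search?v&h=node_name,name,active,rejected,completed
```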

After some searching, it seems this error can be caused by one of these reasons (quick checks are sketched after the list):

  • High JVM memory usage
  • Number of shards
  • Memory usage
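A minimal set of checks for these, assuming Kibana Dev Tools or curl against the cluster:

```
# heap and RAM pressure per node
GET _cat/nodes?v&h=name,heap.percent,heap.max,ram.percent

# how many shards each data node is carrying
GET _cat/allocation?v&h=node,shards,disk.percent
```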

This is my output of /_nodes/stats (I omitted some information).

My cluster has 3 nodes:

  • Primary node has 1GB of RAM
  • Second node has 2GB of RAM
  • Third node has 2GB of RAM

All JVM memory usage is under 70%.

I know this configuration is the minimum required for a production environment.

My questions are:

I'm using scroll_id in my searches, and these scrolls have an expiration time. If I set a long expiration time for these scrolls, could it affect the search threads of my cluster?

Is there any explanation for the rejected count?

What is the output from the _cluster/stats?pretty&human API?

But your nodes are pretty small. Not sure where you read the min requirements for production, but I wouldn't be running a cluster on less than 4GB.

It can affect resource usage as it will prevent segment merging and can use a lot of heap. That will affect the cluster as a whole including search queries.
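If long-lived scrolls are a suspect, you can see how many search contexts are open and release them; a rough sketch in console syntax (clearing all scrolls is heavy-handed, so treat it as a diagnostic step):

```
# open search/scroll contexts per node
GET _nodes/stats/indices/search?human

# release every open scroll context
DELETE /_search/scroll/_all
```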

How many concurrent queries are you running? How many documents does each query return in total?

If heap usage looks OK and is not constantly very high, you may be limited by CPU usage. Check your monitoring. As mentioned by Mark, your cluster is very small and could simply be overloaded. I would recommend increasing its size to see if that resolves the issue and is better suited to your usage pattern.
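If full monitoring isn't set up, a quick check along these lines shows per-node CPU, load, and heap:

```
GET _cat/nodes?v&h=name,cpu,load_1m,heap.percent
```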

Sure, here it is:

```
{
  "_nodes": {
    "total": 3,
    "successful": 3,
    "failed": 0
  },
  "cluster_name": "cluster",
  "timestamp": 1624881622786,
  "status": "yellow",
  "indices": {
    "count": 51,
    "shards": {
      "total": 104,
      "primaries": 52,
      "replication": 1.0,
      "index": {
        "shards": {
          "min": 2,
          "max": 4,
          "avg": 2.0392156862745097
        },
        "primaries": {
          "min": 1,
          "max": 2,
          "avg": 1.0196078431372548
        },
        "replication": {
          "min": 1.0,
          "max": 1.0,
          "avg": 1.0
        }
      }
    },
    "docs": {
      "count": 23387475,
      "deleted": 5017084
    },
    "store": {
      "size": "37.7gb",
      "size_in_bytes": 40518326002,
      "throttle_time": "0s",
      "throttle_time_in_millis": 0
    },
    "fielddata": {
      "memory_size": "15.6mb",
      "memory_size_in_bytes": 16370512,
      "evictions": 0
    },
    "query_cache": {
      "memory_size": "403.6mb",
      "memory_size_in_bytes": 423226936,
      "total_count": 3963931,
      "hit_count": 25033,
      "miss_count": 3938898,
      "cache_size": 2699,
      "cache_count": 2941,
      "evictions": 242
    },
    "completion": {
      "size": "0b",
      "size_in_bytes": 0
    },
    "segments": {
      "count": 368,
      "memory": "73.7mb",
      "memory_in_bytes": 77368608,
      "terms_memory": "66.3mb",
      "terms_memory_in_bytes": 69551687,
      "stored_fields_memory": "5.9mb",
      "stored_fields_memory_in_bytes": 6261064,
      "term_vectors_memory": "0b",
      "term_vectors_memory_in_bytes": 0,
      "norms_memory": "267.1kb",
      "norms_memory_in_bytes": 273600,
      "points_memory": "153.2kb",
      "points_memory_in_bytes": 156921,
      "doc_values_memory": "1mb",
      "doc_values_memory_in_bytes": 1125336,
      "index_writer_memory": "19.4mb",
      "index_writer_memory_in_bytes": 20423547,
      "version_map_memory": "4.7mb",
      "version_map_memory_in_bytes": 5021862,
      "fixed_bit_set": "11.8kb",
      "fixed_bit_set_memory_in_bytes": 12152,
      "max_unsafe_auto_id_timestamp": -1,
      "file_sizes": {
      }
    }
  },
  "nodes": {
    "count": {
      "total": 3,
      "data": 2,
      "coordinating_only": 0,
      "master": 3,
      "ingest": 2
    },
    "versions": [
      "5.6.16"
    ],
    "os": {
      "available_processors": 54,
      "allocated_processors": 6,
      "names": [
        {
          "name": "Linux",
          "count": 3
        }
      ],
      "mem": {
        "total": "342.2gb",
        "total_in_bytes": 367540596736,
        "free": "10.1gb",
        "free_in_bytes": 10935189504,
        "used": "332.1gb",
        "used_in_bytes": 356605407232,
        "free_percent": 3,
        "used_percent": 97
      }
    },
    "process": {
      "cpu": {
        "percent": 2
      },
      "open_file_descriptors": {
        "min": 389,
        "max": 542,
        "avg": 481
      }
    },
    "jvm": {
      "max_uptime": "2.8d",
      "max_uptime_in_millis": 244444954,
      "versions": [
        {
          "version": "1.8.0_144",
          "vm_name": "Java HotSpot(TM) 64-Bit Server VM",
          "vm_version": "25.144-b01",
          "vm_vendor": "Oracle Corporation",
          "count": 3
        }
      ],
      "mem": {
        "heap_used": "1.7gb",
        "heap_used_in_bytes": 1912075264,
        "heap_max": "4.3gb",
        "heap_max_in_bytes": 4675796992
      },
      "threads": 213
    },
    "fs": {
      "total": "242gb",
      "total_in_bytes": 259845521408,
      "free": "203.6gb",
      "free_in_bytes": 218711728128,
      "available": "203.6gb",
      "available_in_bytes": 218711728128
    },
    "plugins": [
      {
        "name": "repository-s3",
        "version": "5.6.16",
        "description": "The S3 repository plugin adds S3 repositories",
        "classname": "org.elasticsearch.repositories.s3.S3RepositoryPlugin",
        "has_native_controller": false
      },
      {
        "name": "x-pack",
        "version": "5.6.16",
        "description": "Elasticsearch Expanded Pack Plugin",
        "classname": "org.elasticsearch.xpack.XPackPlugin",
        "has_native_controller": true
      },
      {
        "name": "found-elasticsearch",
        "version": "5.6.16",
        "description": "Elasticsearch plugin for Found",
        "classname": "org.elasticsearch.plugin.found.FoundPlugin",
        "has_native_controller": false
      },
      {
        "name": "repository-gcs",
        "version": "5.6.16",
        "description": "The GCS repository plugin adds Google Cloud Storage support for repositories.",
        "classname": "org.elasticsearch.repositories.gcs.GoogleCloudStoragePlugin",
        "has_native_controller": false
      }
    ],
    "network_types": {
      "transport_types": {
      },
      "http_types": {
      }
    }
  }
}
```

I decreased the expiration time to 5 minutes.

Looking at the thread pool API, it's 4 concurrent queries.

I have a size of 30 documents per request in my search query.
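For completeness, the scroll usage now looks roughly like this (index name and query are placeholders, with the 5m keep-alive and size 30 mentioned above):

```
POST /my-index/_search?scroll=5m
{
  "size": 30,
  "query": { "match_all": {} }
}

POST /_search/scroll
{
  "scroll": "5m",
  "scroll_id": "<scroll_id returned by the previous call>"
}
```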

I upgraded to 8GB of RAM, but the search requests are still being blocked :frowning:

I found the probable reason.
I'm using Elasticsearch 6.8, so I have a certain type mapping in my index.
In Elasticsearch 7.0 they say the way type mapping works isn't performant, because the way Lucene stores this kind of data isn't the greatest.

I found a service in the cluster issuing a lot of requests just to delete documents of this type (which has 22M documents) using the delete_by_query API.
Maybe I could search for all the documents I want to delete and then delete them by ID (delete API)?

So, is there a better way to solve it?

Note: all documents are related to other documents.
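For reference, delete_by_query can also be throttled and run as a background task so it competes less with searches; a rough sketch (index name, field, value, and rate are placeholder assumptions, not taken from this thread):

```
# limit the delete rate and run it as a task instead of blocking the request
POST /my-index/_delete_by_query?requests_per_second=50&conflicts=proceed&wait_for_completion=false
{
  "query": {
    "term": { "doc_type": "obsolete" }
  }
}
```

The returned task can be watched with the task management API, and the rate adjusted later with _rethrottle.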

That version is no longer supported, please see our EOL matrix.
Please upgrade ASAP.

This is exactly what I'm doing.

I found the problem: I had a service making a huge number of requests using the delete by query API. After stopping it, the rejection count went back to zero.

Thanks for the help <3


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.