Bool filter performance and its alternatives

Hi everyone,

I have been profiling a query that is noticeably slow in our ES setup.

Setup:
AWS, ES 7.3, 11 data nodes, 1 index, 22 shards total, total # of docs: ~350M

We store documents like this (~20 fields and most of them have a numeric type):

    {...
    "id1": long,
    "id2": long,
    status: byte # contains a number from [1..20]
    ...
    }

The following query gives us consistently high latency (~500ms-1second):

    {
      "query":{
        "bool":{
          "filter":[
            {
              "term":{
                "id1":{
                  "value":<id1>
                }
              }
            },
            {
              "term":{
                "id2":{
                  "value":<id2>
                }
              }
            },
            {
              "terms":{
                "status":[
                  1,
                  2,
                  3,
                  4,
                  5,
                  6,
                  7,
                  8,
                  9,
                  10,
                  11,
                  12,
                  13,
                  14,
                  15,
                  16,
                  17,
                  18,
                  19,
                  20
                ]
              }
            }
          ]
        }
      }
    }

# of hits: 22 documents

Profiling this query shows the majority of time is spent on the 3rd term (build_scorer to be more precise):
1 ms on 1st and 2nd terms and 500+ ms on the third one.

    {
      "type" : "PointInSetQuery",
      "description" : "status:{1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20}",
      "time" : "513.9ms",
      "time_in_nanos" : 513960388,
      "breakdown" : {
        "score" : 0,
        "build_scorer_count" : 78,
        "match_count" : 0,
        "create_weight" : 1483,
        "next_doc" : 0,
        "match" : 0,
        "create_weight_count" : 1,
        "next_doc_count" : 0,
        "score_count" : 0,
        "build_scorer" : 513924033,
        "advance" : 34767,
        "advance_count" : 26
      }
    }

This example is artificial and intentionally exaggerated since if we remove the 3rd term this query still returns 22 documents as we enumerated all possible values ([1..20]) of "status" field. However, based on what I've read the 3rd term in theory should've run against 22 documents only. I've tried different ways to express the same query, e.g., moved the 3rd term into "must", changed order and etc. The result is usually the same or even worse. If I leave just a subset of "status" values to compare against, the latency is slightly better but still at least 2 orders of magnitude higher than I'd expect.

Any help or pointers would be really appreciated!

Thanks.