High resource usage of query with large term filter

We are running some rather large queries with about 6 thousand term values in the filter section.

The queries take 20 to 40 seconds longer to run, and much higher CPU usage is observed, when the large terms filter is included.

The ES cluster has 7 nodes for searching, each with a 32-core CPU, 128GB of memory and a 31GB JVM heap.
Each shard of the searched index is about 30GB, with around 160 shards in total across the 7 nodes. The search returns the first 25 matches out of around 3 million hits.

Here is an example of the query.

{
  "from": 0,
  "size": 25,
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "Some query string",
            "fields": [],
            "type": "best_fields",
            "default_operator": "or",
            "max_determinized_states": 10000,
            "enable_position_increments": true,
            "fuzziness": "AUTO",
            "fuzzy_prefix_length": 0,
            "fuzzy_max_expansions": 50,
            "phrase_slop": 0,
            "escape": false,
            "auto_generate_synonyms_phrase_query": true,
            "fuzzy_transpositions": true,
            "boost": 1
          }
        }
      ],
      "filter": [
        {
          "terms": {
            "term_field_name": [
              "0WGJXlHtQSidU6cyt0rQrw", ... # about 6 thousand unique values for this term
            ],
            "boost": 1
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1
    }
  },
  "_source": {
    "includes": [],
    "excludes": []
  },
  "sort": [
    {
      "time": {
        "order": "desc",
        "missing": "_last",
        "unmapped_type": "date"
      }
    }
  ],
  "track_total_hits": 2147483647
}

CPU usage was high while the queries ran, so I captured some hot threads output.

 100.4% [cpu=100.4%, other=0.0%] (502ms out of 500ms) cpu usage by thread 'elasticsearch[datanode-es-jetstream-std-ld1-bzvs.us-east1-c.c.svcstus-sedjetstream.internal][search][T#5]'
     2/10 snapshots sharing following 34 elements
       app//org.apache.lucene.search.ConjunctionDISI.doNext(ConjunctionDISI.java:211)
       app//org.apache.lucene.search.ConjunctionDISI.advance(ConjunctionDISI.java:242)
       app//org.apache.lucene.search.ConjunctionDISI.doNext(ConjunctionDISI.java:211)
       app//org.apache.lucene.search.ConjunctionDISI.nextDoc(ConjunctionDISI.java:253)
       app//org.apache.lucene.search.ConjunctionDISI$BitSetConjunctionDISI.doNext(ConjunctionDISI.java:314)
       app//org.apache.lucene.search.ConjunctionDISI$BitSetConjunctionDISI.advance(ConjunctionDISI.java:310)
       app//org.apache.lucene.search.Weight$StartDISIWrapper.advance(Weight.java:322)
       app//org.apache.lucene.search.Weight$StartDISIWrapper.nextDoc(Weight.java:314)
       app//org.apache.lucene.search.ConjunctionDISI.nextDoc(ConjunctionDISI.java:253)
       app//org.apache.lucene.search.Weight$DefaultBulkScorer.scoreRange(Weight.java:268)
       app//org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:245)
       app//org.elasticsearch.search.internal.CancellableBulkScorer.score(CancellableBulkScorer.java:45)
       app//org.apache.lucene.search.BulkScorer.score(BulkScorer.java:39)
       app//org.elasticsearch.search.internal.ContextIndexSearcher.searchLeaf(ContextIndexSearcher.java:194)
       app//org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:167)
       app//org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
       app//org.elasticsearch.search.query.QueryPhase.searchWithCollector(QueryPhase.java:255)
       app//org.elasticsearch.search.query.QueryPhase.executeInternal(QueryPhase.java:212)
       app//org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:98)
       app//org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:458)
       app//org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:622)
       app//org.elasticsearch.search.SearchService.lambda$executeQueryPhase$2(SearchService.java:483)
       app//org.elasticsearch.search.SearchService$$Lambda$6908/0x0000000801c08430.get(Unknown Source)
       app//org.elasticsearch.search.SearchService$$Lambda$6910/0x0000000801c08a60.get(Unknown Source)
       app//org.elasticsearch.action.ActionRunnable.lambda$supply$0(ActionRunnable.java:47)
       app//org.elasticsearch.action.ActionRunnable$$Lambda$6915/0x0000000801c092a0.accept(Unknown Source)
       app//org.elasticsearch.action.ActionRunnable$2.doRun(ActionRunnable.java:62)
       app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
       app//org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:33)
       app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:777)
       app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
       java.base@18/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
       java.base@18/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
       java.base@18/java.lang.Thread.run(Thread.java:833)
     2/10 snapshots sharing following 40 elements
       app//org.apache.lucene.store.DataInput.readVInt(DataInput.java:129)
       app//org.apache.lucene.codecs.lucene84.Lucene84PostingsReader.readVIntBlock(Lucene84PostingsReader.java:139)
       app//org.apache.lucene.codecs.lucene84.Lucene84PostingsReader$BlockDocsEnum.refillDocs(Lucene84PostingsReader.java:441)
       app//org.apache.lucene.codecs.lucene84.Lucene84PostingsReader$BlockDocsEnum.nextDoc(Lucene84PostingsReader.java:454)
       app//org.apache.lucene.util.BitSet.or(BitSet.java:95)
       app//org.apache.lucene.util.FixedBitSet.or(FixedBitSet.java:271)
       app//org.apache.lucene.util.DocIdSetBuilder.add(DocIdSetBuilder.java:151)
       app//org.apache.lucene.search.TermInSetQuery$1.rewrite(TermInSetQuery.java:287)
       app//org.apache.lucene.search.TermInSetQuery$1.scorer(TermInSetQuery.java:350)
       app//org.apache.lucene.search.Weight.scorerSupplier(Weight.java:148)
       app//org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorerSupplier(LRUQueryCache.java:726)
       app//org.elasticsearch.indices.IndicesQueryCache$CachingWeightWrapper.scorerSupplier(IndicesQueryCache.java:159)
       app//org.apache.lucene.search.BooleanWeight.scorerSupplier(BooleanWeight.java:379)
       app//org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:344)
       app//org.apache.lucene.search.Weight.bulkScorer(Weight.java:182)
       app//org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:338)
       app//org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.bulkScorer(LRUQueryCache.java:837)
       app//org.elasticsearch.indices.IndicesQueryCache$CachingWeightWrapper.bulkScorer(IndicesQueryCache.java:165)
       app//org.elasticsearch.search.internal.ContextIndexSearcher$1.bulkScorer(ContextIndexSearcher.java:244)
       app//org.elasticsearch.search.internal.ContextIndexSearcher.searchLeaf(ContextIndexSearcher.java:191)
       app//org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:167)
       app//org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
       app//org.elasticsearch.search.query.QueryPhase.searchWithCollector(QueryPhase.java:255)
       app//org.elasticsearch.search.query.QueryPhase.executeInternal(QueryPhase.java:212)
       app//org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:98)
       app//org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:458)
       app//org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:622)
       app//org.elasticsearch.search.SearchService.lambda$executeQueryPhase$2(SearchService.java:483)
       app//org.elasticsearch.search.SearchService$$Lambda$6908/0x0000000801c08430.get(Unknown Source)
       app//org.elasticsearch.search.SearchService$$Lambda$6910/0x0000000801c08a60.get(Unknown Source)
       app//org.elasticsearch.action.ActionRunnable.lambda$supply$0(ActionRunnable.java:47)
       app//org.elasticsearch.action.ActionRunnable$$Lambda$6915/0x0000000801c092a0.accept(Unknown Source)
       app//org.elasticsearch.action.ActionRunnable$2.doRun(ActionRunnable.java:62)
       app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
       app//org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:33)
       app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:777)
       app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
       java.base@18/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
       java.base@18/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
       java.base@18/java.lang.Thread.run(Thread.java:833)
     2/10 snapshots sharing following 31 elements
       app//org.apache.lucene.search.ConjunctionDISI.nextDoc(ConjunctionDISI.java:253)
       app//org.apache.lucene.search.ConjunctionDISI$BitSetConjunctionDISI.doNext(ConjunctionDISI.java:314)
       app//org.apache.lucene.search.ConjunctionDISI$BitSetConjunctionDISI.advance(ConjunctionDISI.java:310)
       app//org.apache.lucene.search.Weight$StartDISIWrapper.advance(Weight.java:322)
       app//org.apache.lucene.search.Weight$StartDISIWrapper.nextDoc(Weight.java:314)
       app//org.apache.lucene.search.ConjunctionDISI.nextDoc(ConjunctionDISI.java:253)
       app//org.apache.lucene.search.Weight$DefaultBulkScorer.scoreRange(Weight.java:268)
       app//org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:245)
       app//org.elasticsearch.search.internal.CancellableBulkScorer.score(CancellableBulkScorer.java:45)
       app//org.apache.lucene.search.BulkScorer.score(BulkScorer.java:39)
       app//org.elasticsearch.search.internal.ContextIndexSearcher.searchLeaf(ContextIndexSearcher.java:194)
       app//org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:167)
       app//org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
       app//org.elasticsearch.search.query.QueryPhase.searchWithCollector(QueryPhase.java:255)
       app//org.elasticsearch.search.query.QueryPhase.executeInternal(QueryPhase.java:212)
       app//org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:98)
       app//org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:458)
       app//org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:622)
       app//org.elasticsearch.search.SearchService.lambda$executeQueryPhase$2(SearchService.java:483)
       app//org.elasticsearch.search.SearchService$$Lambda$6908/0x0000000801c08430.get(Unknown Source)
       app//org.elasticsearch.search.SearchService$$Lambda$6910/0x0000000801c08a60.get(Unknown Source)
       app//org.elasticsearch.action.ActionRunnable.lambda$supply$0(ActionRunnable.java:47)
       app//org.elasticsearch.action.ActionRunnable$$Lambda$6915/0x0000000801c092a0.accept(Unknown Source)
       app//org.elasticsearch.action.ActionRunnable$2.doRun(ActionRunnable.java:62)
...
100.3% [cpu=100.3%, other=0.0%] (501.6ms out of 500ms) cpu usage by thread 'elasticsearch[HOST_NAME_HERE][search][T#3]'
     2/10 snapshots sharing following 29 elements
       app//org.apache.lucene.search.ConjunctionDISI.advance(ConjunctionDISI.java:242)
       app//org.apache.lucene.search.Weight$StartDISIWrapper.advance(Weight.java:322)
       app//org.apache.lucene.search.Weight$StartDISIWrapper.nextDoc(Weight.java:314)
       app//org.apache.lucene.search.ConjunctionDISI.nextDoc(ConjunctionDISI.java:253)
       app//org.apache.lucene.search.Weight$DefaultBulkScorer.scoreRange(Weight.java:268)
       app//org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:245)
       app//org.elasticsearch.search.internal.CancellableBulkScorer.score(CancellableBulkScorer.java:45)
       app//org.apache.lucene.search.BulkScorer.score(BulkScorer.java:39)
       app//org.elasticsearch.search.internal.ContextIndexSearcher.searchLeaf(ContextIndexSearcher.java:194)
       app//org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:167)
       app//org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
       app//org.elasticsearch.search.query.QueryPhase.searchWithCollector(QueryPhase.java:255)
       app//org.elasticsearch.search.query.QueryPhase.executeInternal(QueryPhase.java:212)
       app//org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:98)
       app//org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:458)
       app//org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:622)
       app//org.elasticsearch.search.SearchService.lambda$executeQueryPhase$2(SearchService.java:483)
       app//org.elasticsearch.search.SearchService$$Lambda$6833/0x0000000801bee838.get(Unknown Source)
       app//org.elasticsearch.search.SearchService$$Lambda$6836/0x0000000801beea48.get(Unknown Source)
       app//org.elasticsearch.action.ActionRunnable.lambda$supply$0(ActionRunnable.java:47)
       app//org.elasticsearch.action.ActionRunnable$$Lambda$6840/0x0000000801bef288.accept(Unknown Source)
       app//org.elasticsearch.action.ActionRunnable$2.doRun(ActionRunnable.java:62)
       app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
       app//org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:33)
       app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:777)
       app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
       java.base@18/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
       java.base@18/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
       java.base@18/java.lang.Thread.run(Thread.java:833)
     2/10 snapshots sharing following 25 elements
       app//org.apache.lucene.search.Weight$DefaultBulkScorer.scoreRange(Weight.java:265)
       app//org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:245)
       app//org.elasticsearch.search.internal.CancellableBulkScorer.score(CancellableBulkScorer.java:45)
       app//org.apache.lucene.search.BulkScorer.score(BulkScorer.java:39)
       app//org.elasticsearch.search.internal.ContextIndexSearcher.searchLeaf(ContextIndexSearcher.java:194)
       app//org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:167)
       app//org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
       app//org.elasticsearch.search.query.QueryPhase.searchWithCollector(QueryPhase.java:255)
       app//org.elasticsearch.search.query.QueryPhase.executeInternal(QueryPhase.java:212)
       app//org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:98)
       app//org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:458)
       app//org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:622)
       app//org.elasticsearch.search.SearchService.lambda$executeQueryPhase$2(SearchService.java:483)
       app//org.elasticsearch.search.SearchService$$Lambda$6833/0x0000000801bee838.get(Unknown Source)
       app//org.elasticsearch.search.SearchService$$Lambda$6836/0x0000000801beea48.get(Unknown Source)
       app//org.elasticsearch.action.ActionRunnable.lambda$supply$0(ActionRunnable.java:47)
       app//org.elasticsearch.action.ActionRunnable$$Lambda$6840/0x0000000801bef288.accept(Unknown Source)
       app//org.elasticsearch.action.ActionRunnable$2.doRun(ActionRunnable.java:62)
       app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
       app//org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:33)
       app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:777)
       app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
       java.base@18/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
       java.base@18/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
       java.base@18/java.lang.Thread.run(Thread.java:833)
     2/10 snapshots sharing following 38 elements
       app//org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTermLeaf(SegmentTermsEnumFrame.java:638)
       app//org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.scanToTerm(SegmentTermsEnumFrame.java:549)
       app//org.apache.lucene.codecs.blocktree.SegmentTermsEnum.seekExact(SegmentTermsEnum.java:511)
       app//org.apache.lucene.index.FilterLeafReader$FilterTermsEnum.seekExact(FilterLeafReader.java:184)
       app//org.apache.lucene.index.FilterLeafReader$FilterTermsEnum.seekExact(FilterLeafReader.java:184)
       app//org.apache.lucene.search.TermInSetQuery$1.rewrite(TermInSetQuery.java:284)
       app//org.apache.lucene.search.TermInSetQuery$1.scorer(TermInSetQuery.java:350)
       app//org.apache.lucene.search.Weight.scorerSupplier(Weight.java:148)
       app//org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorerSupplier(LRUQueryCache.java:726)
       app//org.elasticsearch.indices.IndicesQueryCache$CachingWeightWrapper.scorerSupplier(IndicesQueryCache.java:159)
       app//org.apache.lucene.search.BooleanWeight.scorerSupplier(BooleanWeight.java:379)
       app//org.apache.lucene.search.BooleanWeight.scorer(BooleanWeight.java:344)
       app//org.apache.lucene.search.Weight.bulkScorer(Weight.java:182)
       app//org.apache.lucene.search.BooleanWeight.bulkScorer(BooleanWeight.java:338)
       app//org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.bulkScorer(LRUQueryCache.java:837)
       app//org.elasticsearch.indices.IndicesQueryCache$CachingWeightWrapper.bulkScorer(IndicesQueryCache.java:165)
       app//org.elasticsearch.search.internal.ContextIndexSearcher$1.bulkScorer(ContextIndexSearcher.java:244)
       app//org.elasticsearch.search.internal.ContextIndexSearcher.searchLeaf(ContextIndexSearcher.java:191)
       app//org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:167)
       app//org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
       app//org.elasticsearch.search.query.QueryPhase.searchWithCollector(QueryPhase.java:255)
       app//org.elasticsearch.search.query.QueryPhase.executeInternal(QueryPhase.java:212)
       app//org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:98)
       app//org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:458)
       app//org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:622)
       app//org.elasticsearch.search.SearchService.lambda$executeQueryPhase$2(SearchService.java:483)
       app//org.elasticsearch.search.SearchService$$Lambda$6833/0x0000000801bee838.get(Unknown Source)
       app//org.elasticsearch.search.SearchService$$Lambda$6836/0x0000000801beea48.get(Unknown Source)
       app//org.elasticsearch.action.ActionRunnable.lambda$supply$0(ActionRunnable.java:47)
       app//org.elasticsearch.action.ActionRunnable$$Lambda$6840/0x0000000801bef288.accept(Unknown Source)
       app//org.elasticsearch.action.ActionRunnable$2.doRun(ActionRunnable.java:62)
       app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
       app//org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:33)
       app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:777)
       app//org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
       java.base@18/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
       java.base@18/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
       java.base@18/java.lang.Thread.run(Thread.java:833)
     4/10 snapshots sharing following 20 elements
       app//org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:167)
       app//org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
       app//org.elasticsearch.search.query.QueryPhase.searchWithCollector(QueryPhase.java:255)
       app//org.elasticsearch.search.query.QueryPhase.executeInternal(QueryPhase.java:212)
       app//org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:98)
       app//org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:458)
       app//org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:622)
       app//org.elasticsearch.search.SearchService.lambda$executeQueryPhase$2(SearchService.java:483)
       app//org.elasticsearch.search.SearchService$$Lambda$6833/0x0000000801bee838.get(Unknown Source)
       app//org.elasticsearch.search.SearchService$$Lambda$6836/0x0000000801beea48.get(Unknown Source)
       app//org.elasticsearch.action.ActionRunnable.lambda$supply$0(ActionRunnable.java:47)
....
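
To see at a glance where the sampled time goes, a dump like the one above can be reduced to its top-of-stack frames. The following is a minimal Python sketch, not an Elasticsearch tool: `summarize_hot_threads` and the regular expression are hypothetical helpers written against the text format shown above, and the sample is a shortened excerpt of that dump.

```python
import re
from collections import Counter

# Matches group headers such as "2/10 snapshots sharing following 34 elements".
HEADER = re.compile(r"^\s*(\d+)/(\d+) snapshots sharing following \d+ elements")

def summarize_hot_threads(dump: str) -> Counter:
    """Count sampled snapshots per top-of-stack frame in a hot threads dump."""
    counts: Counter = Counter()
    pending = 0  # snapshots from the last header still waiting for a top frame
    for line in dump.splitlines():
        header = HEADER.match(line)
        if header:
            pending = int(header.group(1))
            continue
        frame = line.strip()
        if pending and frame:
            # Drop the "app//" module prefix and the "(File.java:123)" suffix.
            name = frame.split("//")[-1].split("(")[0]
            counts[name] += pending
            pending = 0
    return counts

sample = """\
     2/10 snapshots sharing following 34 elements
       app//org.apache.lucene.search.ConjunctionDISI.doNext(ConjunctionDISI.java:211)
       app//org.apache.lucene.search.ConjunctionDISI.advance(ConjunctionDISI.java:242)
     2/10 snapshots sharing following 40 elements
       app//org.apache.lucene.store.DataInput.readVInt(DataInput.java:129)
       app//org.apache.lucene.codecs.lucene84.Lucene84PostingsReader.readVIntBlock(Lucene84PostingsReader.java:139)
"""

for frame, snapshots in summarize_hot_threads(sample).most_common():
    print(f"{snapshots:>3}  {frame}")
```

Read this way, the stacks that pass through `TermInSetQuery$1.rewrite` (`seekExact`, `readVInt`, `DocIdSetBuilder.add`) appear to be building the terms filter's doc id set for a segment, while the `ConjunctionDISI` stacks are intersecting it with the other clause.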

I have read through related topics and tried a few things suggested on the forum, but none of them has provided much of a performance gain so far.

Attempt 1: Use a constant_score query like the one below to skip scoring.

{
  "from": 0,
  "size": 25,
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "must": [
            {
              "terms": {
                "term_field_name": [
                  "18xHaqUQyuzlbEcZlOaaQ", ... # 6K more similar values
                ],
                "boost": 1
              }
            },
            {
              "query_string": {
                "query": "Some query string",
                "fields": [

                ],
                "type": "best_fields",
                "default_operator": "or",
                "max_determinized_states": 10000,
                "enable_position_increments": true,
                "fuzziness": "AUTO",
                "fuzzy_prefix_length": 0,
                "fuzzy_max_expansions": 50,
                "phrase_slop": 0,
                "escape": false,
                "auto_generate_synonyms_phrase_query": true,
                "fuzzy_transpositions": true,
                "boost": 1
              }
            }
          ]
        }
      }
    }
  },
  "_source": {
    "includes": [

    ],
    "excludes": [

    ]
  },
  "sort": [
    {
      "time": {
        "order": "desc",
        "missing": "_last",
        "unmapped_type": "date"
      }
    }
  ],
  "track_total_hits": 2147483647
}
            

Attempt 2: Break large 6K filter section into smaller chunks

{
  "from": 0,
  "size": 25,
  "query": {
    "bool": {
      "minimum_should_match": 1,
      "should": [
        {
          "terms": {
            "term_field_name": [
              "-18xHaqUQyuzlbEcZlOaaQ",
              ... # up to 16 of such values
            ]
          }
        },
        {
          "terms": {
            "term_field_name": [
              "ababaabb",
              ... # up to 16 of such values
            ]
          }
        },
        ... # more of them, combined value size is 6K ish
      ],
      "must": [
        {
          "query_string": {
            "query": "Some Query",
            "fields": [],
            "type": "best_fields",
            "default_operator": "or",
            "max_determinized_states": 10000,
            "enable_position_increments": true,
            "fuzziness": "AUTO",
            "fuzzy_prefix_length": 0,
            "fuzzy_max_expansions": 50,
            "phrase_slop": 0,
            "escape": false,
            "auto_generate_synonyms_phrase_query": true,
            "fuzzy_transpositions": true,
            "boost": 1
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1
    }
  },
  "_source": {
    "includes": [],
    "excludes": []
  },
  "sort": [
    {
      "time": {
        "order": "desc",
        "missing": "_last",
        "unmapped_type": "date"
      }
    }
  ],
  "track_total_hits": 2147483647
}

Could you please suggest anything else we can try?

Correction for "Attempt 2: Break large 6K filter section into smaller chunks" above.

The query tried was also wrapped in a constant_score filter, like this:

{
  "from": 0,
  "size": 25,
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "minimum_should_match": 1,
          "should": [
            {
              "terms": {
                "term_field_name": [
                  "-18xHaqUQyuzlbEcZlOaaQ", ... # up to 16 such values
                ]
              }
            },
            {
              "terms": {
                "term_field_name": [
                  "aabbcc", ... # up to 16 such values
                ]
              }
            }
          ],
          "must": [
            {
              "query_string": {
                "query": "Some query String",
                "fields": [],
                "type": "best_fields",
                "default_operator": "or",
                "max_determinized_states": 10000,
                "enable_position_increments": true,
                "fuzziness": "AUTO",
                "fuzzy_prefix_length": 0,
                "fuzzy_max_expansions": 50,
                "phrase_slop": 0,
                "escape": false,
                "auto_generate_synonyms_phrase_query": true,
                "fuzzy_transpositions": true,
                "boost": 1
              }
            }
          ]
        }
      }
....
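
For anyone wanting to reproduce the chunked variant, the request body can be generated rather than written by hand. This is a minimal Python sketch under assumptions: `chunked_terms_query` is a hypothetical helper, the ids are placeholders, and the `query_string` clause carries only the query text (the other options shown above are omitted for brevity).

```python
from typing import Any, Dict, List

def chunked_terms_query(field: str, values: List[str], query_string: str,
                        chunk_size: int = 16) -> Dict[str, Any]:
    """Build a constant_score query whose filter ORs terms clauses of at most chunk_size values."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be positive")
    should = [
        {"terms": {field: values[i:i + chunk_size]}}
        for i in range(0, len(values), chunk_size)
    ]
    return {
        "constant_score": {
            "filter": {
                "bool": {
                    "minimum_should_match": 1,
                    "should": should,
                    "must": [{"query_string": {"query": query_string}}],
                }
            }
        }
    }

# Placeholder ids; the real request used about 6 thousand values.
ids = [f"id-{n}" for n in range(40)]
query = chunked_terms_query("term_field_name", ids, "Some query String")
print(len(query["constant_score"]["filter"]["bool"]["should"]))  # 3 clauses: 16 + 16 + 8
```

Chunking changes only how the values are grouped; every value still has to be looked up in the index, which fits the observation that this attempt did not help much.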


An alternative approach is benchmarked in this issue; links in that issue take you to some example queries that you can try on your dataset to see if they help.

It seems the proposed solution is to use script-based filtering instead of terms-based filtering for large sets.
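
For reference, a script-based filter of this kind might look like the request body built below. This is a sketch under assumptions, not the exact query from the linked issue: it assumes `term_field_name` is a single-valued `keyword` field with doc values, and it passes the ids as a map parameter (hypothetically named `ids`) so that the Painless script can do a hash lookup with `containsKey` instead of scanning a list. `script_filter_query` is a made-up helper name.

```python
import json
from typing import Any, Dict, Iterable

def script_filter_query(field: str, values: Iterable[str], query_string: str) -> Dict[str, Any]:
    """Build a bool query that filters by a Painless script instead of a terms clause."""
    source = (
        # Skip docs without a value, then test membership in the params map.
        "doc[params.field].size() != 0 && "
        "params.ids.containsKey(doc[params.field].value)"
    )
    return {
        "bool": {
            "must": [{"query_string": {"query": query_string}}],
            "filter": [
                {
                    "script": {
                        "script": {
                            "lang": "painless",
                            "source": source,
                            "params": {
                                "field": field,
                                # A JSON object used as a set: key = id, value unused.
                                "ids": {value: True for value in values},
                            },
                        }
                    }
                }
            ],
        }
    }

body = {
    "size": 25,
    "query": script_filter_query("term_field_name", ["0WGJXlHtQSidU6cyt0rQrw"], "Some query string"),
}
print(json.dumps(body, indent=2))
```

Whether this beats a plain terms filter has to be measured on the real data, since the script runs once per candidate document.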

My impression was that a script-based solution is often slower, so I was surprised to see better performance from the script-based solution for set sizes larger than 12K in his test, though it was worse for smaller sets.

The unsurprising thing is that random disk seeks are slow, so looking up thousands of terms in an index, along with their matching doc ids, will always be slow for a search.
Sometimes, however, retrieving doc terms from a doc-values disk structure in a more linear fashion, and checking which ones are in an in-memory hashmap of search values, works out faster.
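
The two access patterns can be contrasted with a toy in-memory model. This Python sketch only illustrates the idea and is not how Lucene stores data: `postings` stands in for the inverted index (term to doc ids) and `doc_values` for the per-document column; both names and the sample data are made up.

```python
from typing import Dict, List, Set

# Hypothetical toy segment: doc id -> single keyword value.
doc_values: List[str] = ["a", "b", "a", "c", "d", "b", "e"]

# Inverted view of the same data: term -> sorted doc ids.
postings: Dict[str, List[int]] = {}
for doc_id, term in enumerate(doc_values):
    postings.setdefault(term, []).append(doc_id)

def match_via_postings(search_terms: Set[str]) -> Set[int]:
    """One lookup per search term, then union the doc id lists (terms-filter style)."""
    matches: Set[int] = set()
    for term in search_terms:  # work grows with the number of search terms
        matches.update(postings.get(term, []))
    return matches

def match_via_doc_values(search_terms: Set[str], candidates: List[int]) -> Set[int]:
    """One hash lookup per candidate doc, visited in doc id order (script style)."""
    # Work grows with the number of candidate docs, not the number of terms.
    return {d for d in candidates if doc_values[d] in search_terms}

wanted = {"a", "e", "zzz"}
all_docs = list(range(len(doc_values)))
print(sorted(match_via_postings(wanted)))              # [0, 2, 6]
print(sorted(match_via_doc_values(wanted, all_docs)))  # [0, 2, 6]
```

Both functions return the same documents; what differs is whether the work scales with the number of search terms or with the number of documents visited.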

Performance depends on how many matching docs and how many unique search terms you have.
