Elasticsearch query timeouts on data stream (50M–170M docs, facets + NOT queries)

Anup_Kumar · January 20, 2026, 12:49pm

Hi all,

I’m facing Elasticsearch query timeouts for a search workload stored in a data stream (immutable data, bulk indexed historical data). We are using Point-in-Time (PIT) searches for query consistency/pagination.

Backing index examples

One large backing index:

PRI=10, REP=2
docs.count = 170,461,279
store.size = 743.3GB
pri.store.size = 247.4GB

Another backing index with fewer docs also times out:

PRI=10, REP=2
docs.count = 49,764,512 (example)
still seeing query timeouts

Rollover setup

ILM rollover configured:

max_primary_shard_size = 30GB
max_size = 300GB

Rollover did not happen for the 170M index because shard sizes and total primary size are still below these thresholds.
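
The arithmetic behind that statement can be checked with a short sketch. It assumes the primaries are roughly equal in size (per-shard figures are not given in the post) and that max_size is compared against the total primary store size, with replicas not counted.

```python
# Figures quoted above for the 170M-document backing index.
PRIMARY_SHARDS = 10
PRI_STORE_GB = 247.4

# ILM rollover thresholds from the policy.
MAX_PRIMARY_SHARD_SIZE_GB = 30
MAX_SIZE_GB = 300  # compared against total primary size; replicas are not counted


def rollover_due(pri_store_gb: float, primaries: int) -> bool:
    """Return True if either size condition would trigger a rollover.

    Assumes evenly sized primaries, which is an approximation: ILM looks at
    the largest primary shard, not the average.
    """
    avg_shard_gb = pri_store_gb / primaries
    return avg_shard_gb >= MAX_PRIMARY_SHARD_SIZE_GB or pri_store_gb >= MAX_SIZE_GB


avg = PRI_STORE_GB / PRIMARY_SHARDS
print(f"average primary shard: {avg:.2f} GB")
print(f"rollover due: {rollover_due(PRI_STORE_GB, PRIMARY_SHARDS)}")
```

With an average of about 24.7 GB per primary and 247.4 GB in total, neither threshold is reached, which matches the observed behavior.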

Query pattern

A typical query returns the top 25 results and often includes facets:

{
  "searchText": "NOT ABC",
  "top": 25,
  "includeFacets": true,
  "filters": { "projectName": ["X"] }
}

We execute searches using Point-in-Time (PIT) for consistent pagination.

Queries may include facets, implemented as aggregations.

Filters are applied using post_filter, and pagination is done using from/size (Skip/Take).
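
For readers unfamiliar with this pattern, here is a rough sketch of what a request body of this shape could look like. The query actually sent to Elasticsearch is not shown in the thread, so the searched field (eventTitle), the facet field (projectName.raw), and the PIT id placeholder are assumptions taken from the mapping posted later. One known property of the pattern: post_filter is applied after aggregations are computed, so facet counts cover every document matching the main query, not only the filtered project.

```python
import json


def build_search_body(pit_id: str, excluded_term: str, project: str,
                      skip: int = 0, take: int = 25) -> dict:
    """Build a search body using PIT, a NOT query, facets and post_filter."""
    return {
        # PIT searches carry the target indices in the PIT id itself.
        "pit": {"id": pit_id, "keep_alive": "1m"},
        "from": skip,
        "size": take,
        # "NOT ABC": a pure negation matches every document without the term.
        "query": {
            "bool": {"must_not": [{"match": {"eventTitle": excluded_term}}]}
        },
        # Facets run over all hits of "query", before post_filter is applied.
        "aggs": {
            "projects": {"terms": {"field": "projectName.raw", "size": 10}}
        },
        # Narrows the returned hits only; it does not narrow the aggregations.
        "post_filter": {"terms": {"projectName.raw": [project]}},
    }


body = build_search_body("<pit-id>", "ABC", "X")
print(json.dumps(body, indent=2))
```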

Note: We are currently not using custom routing. We are relying on Elasticsearch’s default routing behavior for data streams (routing based on document _id), so searches may fan out across all primary shards/backing indices.
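
To illustrate the fan-out described in the note: with default routing, the target shard is derived from a hash of the routing value, which defaults to the document _id. The sketch below uses the simplified form shard = hash(_id) % number_of_primary_shards and substitutes MD5 for the Murmur3 hash Elasticsearch uses, so the shard numbers will not match a real cluster. The document ids are hypothetical.

```python
import hashlib

PRIMARY_SHARDS = 10


def shard_for(routing_value: str, primaries: int = PRIMARY_SHARDS) -> int:
    """Simplified stand-in for default routing: hash(routing) % primaries.

    Elasticsearch uses Murmur3 and a routing-shards factor; MD5 is used here
    only because it ships with the standard library.
    """
    digest = hashlib.md5(routing_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % primaries


# Hypothetical ids of documents that all belong to one project.
doc_ids = [f"project-x-doc-{i}" for i in range(1000)]
shards_hit = {shard_for(doc_id) for doc_id in doc_ids}

# Ids hash independently of the project, so one project's documents end up
# spread over the shards and a project-filtered search has to visit them all.
print(sorted(shards_hit))
```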

Questions

What are the most common causes of search timeouts on indices with ~50M documents (10 primary shards), even when requesting only the top N results (e.g., 25)?
What are recommended ways to optimize queries that include facets/aggregations (and negative terms like NOT), to reduce latency and prevent timeouts?
For large historical backfills into a data stream, what rollover strategy is recommended? Should we rely more on max_docs, max_primary_shard_size, or a combination of both?
Should we use custom routing, or can we rely on default routing for different projects/groups?

Christian_Dahlqvist · January 20, 2026, 2:30pm

Which version of Elasticsearch are you using?

What is the size and specification of the cluster in terms of CPU, RAM and type of storage used?

What do the mappings for the index look like?

What does a sample query with aggregations look like?

What query latencies are you experiencing? Does this occur when you issue just a single query against the cluster, or does it require a number of concurrent queries?

Do you always filter on project name? If so, how many different project names are there in the index?

Anup_Kumar · January 20, 2026, 5:18pm

Hi @Christian_Dahlqvist, thank you for your response.
Please find the requested information below.
Elasticsearch version

Elasticsearch 8.14

Cluster specs

Data nodes: 8 vCPUs, 64 GiB RAM, SSD 2 TB
Query/coord nodes: 8 vCPUs, 64 GiB RAM

Observed latency / timeouts

Many searches time out at ~60 seconds
Timeouts are more likely when we query at group level (a group contains many projects)
Queries sometimes work for smaller repos, but most fail for larger scopes
We see timeouts even with relatively small result size (top 25)
This can happen even with a single query, but is worse under higher request volume (concurrent traffic)

Filters / selectivity

We usually apply group-level and/or project-level filters
There are many projects, but limited groups

Sample request (slightly modified, not the actual request):

{
"searchFilters": {
"projectName": [ "SampleProject" ],
"projectIdentifier": [ "sample-project-id" ]
},
"options": [ "Faceting", "Highlighting" ],
"skipResults": 0,
"takeResults": 50,
"orderBy": [
{ "Field": "eventDate", "SortOrder": "Desc" }
],
"fields": [
"projectIdentifier",
"projectName",
"identifier",
"recordType",
"itemCategory",
"eventId",
"eventTitle",
"eventDescription",
"eventDate",
"authorName",
"authorEmail",
"authorDate",
"performerName",
"performerEmail",
"indexedTimestamp",
"actionDate",
"@timestamp"
],
"highlightFields": [
"eventTitle",
"eventDescription",
"authorName",
"performerName"
],
"terminateAfter": 0,
"keepAlive": "1m",
"pitId": null,
"continueOnEmptyQuery": false,
"searchAfter": null,
"scopeFiltersExpression": {
"type": "And",
"children": [
{
"type": "Term",
"field": "recordType",
"operator": "Equals",
"value": "sample-record-type"
}
]
},
"queryParseTree": {
"type": "Term",
"field": "eventTitle",
"operator": "Contains",
"value": "test"
}
}
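
The request carries both skipResults and a searchAfter slot. As a sketch of how the paging fields (keepAlive, pitId, searchAfter, orderBy) might translate into an Elasticsearch search body, assuming a straightforward one-to-one mapping (the application's real translation layer is not shown in the thread, and to_search_body is a hypothetical name):

```python
from typing import Any, Optional


def to_search_body(request: dict[str, Any], pit_id: str) -> dict[str, Any]:
    """Map the application-level paging fields onto an Elasticsearch body."""
    sort: list[dict[str, str]] = [
        {item["Field"]: item["SortOrder"].lower()} for item in request["orderBy"]
    ]
    # _shard_doc is the tiebreaker available to PIT searches; it keeps the
    # order stable when many documents share the same eventDate.
    sort.append({"_shard_doc": "asc"})

    body: dict[str, Any] = {
        "pit": {"id": pit_id, "keep_alive": request["keepAlive"]},
        "size": request["takeResults"],
        "sort": sort,
    }
    search_after: Optional[list] = request.get("searchAfter")
    if search_after is not None:
        # Later pages: resume from the sort values of the previous page's last hit.
        body["search_after"] = search_after
    else:
        body["from"] = request["skipResults"]
    return body


first_page = to_search_body(
    {"orderBy": [{"Field": "eventDate", "SortOrder": "Desc"}],
     "keepAlive": "1m", "takeResults": 50, "skipResults": 0,
     "searchAfter": None},
    pit_id="<pit-id>",
)
print(first_page)
```

For the next page, the sort values of the last hit on the current page would be passed as searchAfter, so the offset does not grow with page depth.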

Index mapping

Below is a partial mapping snippet (data stream enabled). Key points:

@timestamp, authorDate, eventDate are date (epoch_second)
Many searchable fields are text with analyzers
Some fields also have keyword subfields (raw) with eager_global_ordinals=true

{
  ".ds-datastream_XX_8140-bc49432d81a7-2026.01.20-000001": {
    "mappings": {
      "_meta": {
        "version": 8140
      },
      "_data_stream_timestamp": {
        "enabled": true
      },
      "properties": {
        "@timestamp": {
          "type": "date",
          "format": "epoch_second"
        },
        "authorDate": {
          "type": "date",
          "format": "epoch_second"
        },
        "authorEmail": {
          "type": "text",
          "fields": {
            "raw": {
              "type": "keyword",
              "eager_global_ordinals": true
            }
          },
          "index_options": "offsets",
          "analyzer": "unstemmedFullTextAnalyzer"
        },
        "authorName": {
          "type": "text",
          "fields": {
            "pattern": {
              "type": "text",
              "index_options": "offsets",
              "analyzer": "contentAnalyzer"
            },
            "raw": {
              "type": "keyword",
              "eager_global_ordinals": true
            }
          },
          "norms": false,
          "analyzer": "LowerCaseAnalyzer"
        },
        "entityId": {
          "type": "text"
        },
        "entityName": {
          "type": "text",
          "fields": {
            "raw": {
              "type": "keyword"
            }
          },
          "norms": false,
          "analyzer": "LowerCaseAnalyzer"
        },
        "entityNameOriginal": {
          "type": "text"
        },
        "eventDate": {
          "type": "date",
          "format": "epoch_second"
        },
        "eventDescription": {
          "type": "text",
          "index_options": "offsets",
          "norms": false,
          "analyzer": "contentAnalyzer"
        },
        "eventId": {
          "type": "text"
        },
        "eventTitle": {
          "type": "text",
          "index_options": "offsets",
          "norms": false,
          "analyzer": "contentAnalyzer"
        },
        "performerEmail": {
          "type": "text",
          "fields": {
            "raw": {
              "type": "keyword",
              "eager_global_ordinals": true
            }
          },
          "index_options": "offsets",
          "analyzer": "unstemmedFullTextAnalyzer"
        },
        "performerName": {
          "type": "text",
          "fields": {
            "pattern": {
              "type": "text",
              "index_options": "offsets",
              "analyzer": "contentAnalyzer"
            },
            "raw": {
              "type": "keyword",
              "eager_global_ordinals": true
            }
          },
          "index_options": "offsets",
          "analyzer": "unstemmedFullTextAnalyzer"
        },
        "recordType": {
          "type": "text"
        },
        "identifier": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "indexedTimestamp": {
          "type": "date",
          "format": "epoch_second"
        },
        "itemCategory": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "projectIdentifier": {
          "type": "text"
        },
        "projectName": {
          "type": "text",
          "fields": {
            "raw": {
              "type": "keyword",
              "eager_global_ordinals": true
            }
          },
          "norms": false,
          "analyzer": "LowerCaseAnalyzer"
        },
        "projectNameOriginal": {
          "type": "text"
        },
        "initiator": {
          "type": "text",
          "fields": {
            "pattern": {
              "type": "text",
              "index_options": "offsets",
              "analyzer": "contentAnalyzer"
            },
            "raw": {
              "type": "keyword",
              "eager_global_ordinals": true
            }
          },
          "index_options": "offsets",
          "analyzer": "unstemmedFullTextAnalyzer"
        },
        "actionDate": {
          "type": "date",
          "format": "epoch_second"
        }
      }
    }
  }
}

Christian_Dahlqvist · January 20, 2026, 5:38pm

Have you tried profiling a few queries that are slow?

If so, what was the result?

Can you show a sample query that is slow exactly as it is sent to Elasticsearch?
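
For anyone following along, profiling means adding "profile": true to the search body; the response then includes a per-shard timing breakdown. Below is a minimal sketch for enabling it and totalling the reported times, assuming the response has already been parsed into a Python dict (sending the request is left out, and both function names are hypothetical).

```python
from typing import Any


def with_profiling(body: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a search body with the Profile API enabled."""
    return {**body, "profile": True}


def summarize_profile(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Total top-level query and aggregation time per shard, in milliseconds."""
    rows = []
    for shard in response.get("profile", {}).get("shards", []):
        query_ns = sum(
            q["time_in_nanos"]
            for search in shard.get("searches", [])
            for q in search.get("query", [])
        )
        aggs_ns = sum(a["time_in_nanos"] for a in shard.get("aggregations", []))
        rows.append({
            "shard": shard.get("id"),
            "query_ms": query_ns / 1e6,
            "aggs_ms": aggs_ns / 1e6,
        })
    # Slowest shards first.
    return sorted(rows, key=lambda r: r["query_ms"] + r["aggs_ms"], reverse=True)
```

Comparing query_ms with aggs_ms per shard gives a first indication of whether the time goes to the query itself or to the facets.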
