Searching through every document in ES?

bensalerno · October 8, 2021, 9:38pm

My cluster has roughly 300,000 documents in it. Right now, I'm only able to query 10,000 at a time. Is there a way to query through all documents? Or is this bad practice?

Many of my search queries are returning 0 results because the matching documents are not within the first 10,000 results. What is my best option here?

Also, is there a way to return the total number of documents in the cluster? I know it's about 300,000 but I really don't know.

stephenb · October 8, 2021, 9:55pm

Hi @bensalerno

There are a few questions there, so let's start sorting them out.

First, for the total number of documents:

GET /_cluster/stats

Statistics on each index:

GET /_cat/indices?v
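As a rough sketch of reading that number in a script: the cluster stats response reports the document count under `indices.docs.count`. The helper name `total_documents` and the sample payload below are made up for illustration; the sample is trimmed down to the relevant keys and its numbers are placeholders, not real output.

```python
def total_documents(cluster_stats):
    """Return the document count from a parsed GET /_cluster/stats response."""
    return cluster_stats["indices"]["docs"]["count"]


# Placeholder payload, trimmed to the keys used above (not real cluster output).
sample_response = {"indices": {"count": 3, "docs": {"count": 300000, "deleted": 0}}}
print(total_documents(sample_response))
```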

You can search across billions or trillions of documents in Elasticsearch, but result sets may be limited, so I am a bit unclear on what you are asking. Certainly you can search across 300K documents.

And you can "page" through many results if needed... see here

Aggregations sometimes have limitations on the number of documents used in calculating them, but I don't think that is what you are asking about.

So let's take a look. Can you show us what your search query looks like? Are you using the Query DSL or Discover? Perhaps we can help.

bensalerno · October 8, 2021, 10:36pm

I am just doing a basic match query: return docs whose Process field matches. However, it only ever returns 10,000 results, never more, and many results show up empty.

bensalerno · October 8, 2021, 10:39pm

When I try to change the size to return more than 10,000, it tells me the result window is too large.
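For context, that error comes from a limit on `from + size`, which must not exceed the index setting `index.max_result_window` (10,000 by default). A minimal sketch of the arithmetic, where the helper name `fits_result_window` is hypothetical:

```python
DEFAULT_MAX_RESULT_WINDOW = 10_000  # default value of index.max_result_window


def fits_result_window(from_, size, max_result_window=DEFAULT_MAX_RESULT_WINDOW):
    """Return True if a from/size request stays inside the result window."""
    return from_ + size <= max_result_window


print(fits_result_window(0, 10_000))  # True: exactly at the limit
print(fits_result_window(0, 10_001))  # False: rejected as "Result window is too large"
```

Note that this limit only caps how many hits a single from/size request can return. It does not restrict which documents the query is evaluated against.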

stephenb · October 8, 2021, 10:46pm

Can you show your actual query and some of the empty results?

If you want more than 10,000 results, you will need to use the paging I referenced above.
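One paging approach Elasticsearch offers is `search_after`, which uses the sort values of the last hit on a page as the cursor for the next page. Whether that is the approach the page referenced above describes is an assumption. The sketch below is only an outline: `iterate_all_hits` is a hypothetical helper, and `search` stands in for whatever function sends a request body to your cluster and returns the parsed JSON response.

```python
def iterate_all_hits(search, query, sort, page_size=1000):
    """Yield every hit for `query` by following search_after cursors.

    `search` is any callable that takes a request body (dict) and returns
    the parsed search response (dict). `sort` should end in a field that is
    unique per document so that pages do not skip or repeat hits.
    """
    body = {"query": query, "sort": sort, "size": page_size}
    while True:
        hits = search(body)["hits"]["hits"]
        if not hits:
            return  # an empty page means everything has been read
        yield from hits
        # The last hit's sort values position the cursor for the next page.
        body = {**body, "search_after": hits[-1]["sort"]}
```

Because each request stays below the 10,000 window, this never triggers the "result window is too large" error. If the index is being written to while you page, results can shift between requests; the Elasticsearch documentation suggests combining `search_after` with a point in time (PIT) for a consistent view.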

bensalerno · October 8, 2021, 11:10pm

I don't need more than 10,000 results. If I search for an item, and it does not return, does that mean it does not exist in the index? Does each query search through the entire index?

dadoonet · October 8, 2021, 11:12pm

Add the track_total_hits option to your search.
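A minimal sketch of what that could look like when building the request in Python. The helper names and the search value `example.exe` are hypothetical; only the field name `Process` comes from this thread. It assumes the 7.x response format, where `hits.total` is an object with `value` and `relation`. By default the total is only tracked accurately up to 10,000 and is then reported with relation `gte`; setting `track_total_hits` to `true` requests an exact count.

```python
import json


def build_match_query(field, value, size=10):
    """Build a match query body that asks for an exact total hit count."""
    return {
        "query": {"match": {field: value}},
        "size": size,
        "track_total_hits": True,
    }


def describe_total(response):
    """Summarize hits.total from a parsed search response (7.x format)."""
    total = response["hits"]["total"]
    prefix = "at least " if total["relation"] == "gte" else ""
    return f"{prefix}{total['value']} matching documents"


body = build_match_query("Process", "example.exe")
print(json.dumps(body, indent=2))
```

With the exact total in the response, a query that reports 0 matching documents really did match nothing in the indices it searched, regardless of how many hits are returned per page.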

stephenb · October 8, 2021, 11:13pm

That is correct, assuming your query is correctly formed and you are searching across an index or index pattern that represents the data you want to search against.

Yes, assuming you have not applied a filter, such as a time filter or a range filter, that limits the search scope.
