How to overcome the two-billion limitation on the number of Elasticsearch records?

Mike_Z · July 25, 2023, 10:25pm

As explained in the Stack Overflow post quoted below, Elasticsearch has a limit of two billion documents.

Yes, there is a limit of 2 billion docs per shard, which is a hard Lucene limit.

There is a maximum number of documents you can have in a single Lucene index. As of LUCENE-5843, the limit is 2,147,483,519 (= Integer.MAX_VALUE - 128) documents.

You should consider scaling horizontally.

However, the official Elastic website features a success story, "Rabobank: Enhancing the Online Banking Experience with Elasticsearch", which mentions a dataset of over 23 billion transactions.

Not only is Rabobank searching faster than ever, they’re searching through more data than ever. With over 23 billion transactions spanning 80TB of data, Rabobank sees upwards of 200 events per second — over 10 million per day. And each query can span thousands of accounts, with corporate customers having over 5,000 accounts that they can now query at once. And being able to do all of this without adding any extra operations to their costly mainframes has helped save them millions of euros per year. Today, all front-end applications use Elasticsearch for search or for aggregating payment and saving transactions.

We are investigating the feasibility of using Elasticsearch as a secondary data source, running side by side with the main SQL database and kept in sync with it. The scenario is OLTP (online transaction processing).

Our Question:

Considering the success story mentioned above, how can we overcome the two-billion limitation on the number of Elasticsearch records?

We highly appreciate any hints and suggestions.

leandrojmp · July 25, 2023, 11:01pm

Hello,

The limit of 2 billion is per shard, not per index. If you have an index with 2 primary shards, for example, each shard can hold up to 2 billion documents, so the index as a whole can hold up to 4 billion documents.
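To make the arithmetic concrete, here is a minimal Python sketch. The constant comes from the LUCENE-5843 figure quoted in the question; the function names are made up for illustration and are not part of any Elasticsearch API. It assumes documents are spread perfectly evenly across shards, which real routing will only approximate.

```python
# Hard per-shard limit quoted above: Integer.MAX_VALUE - 128 (LUCENE-5843).
INTEGER_MAX_VALUE = 2**31 - 1
LUCENE_MAX_DOCS_PER_SHARD = INTEGER_MAX_VALUE - 128  # 2,147,483,519


def max_docs_for_index(primary_shards: int) -> int:
    """Theoretical upper bound on documents for an index with this many primary shards."""
    if primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    return primary_shards * LUCENE_MAX_DOCS_PER_SHARD


def min_primary_shards(doc_count: int) -> int:
    """Smallest number of primary shards that could hold doc_count documents.

    Assumes a perfectly even distribution, so treat the result as a floor,
    not as a sizing recommendation.
    """
    if doc_count < 0:
        raise ValueError("doc_count must be non-negative")
    # Integer ceiling division avoids float rounding on large counts.
    return max(1, -(-doc_count // LUCENE_MAX_DOCS_PER_SHARD))


if __name__ == "__main__":
    print(max_docs_for_index(2))               # 4294967038
    print(min_primary_shards(23_000_000_000))  # 11
```

So 23 billion documents would need at least 11 primary shards just to stay under the hard limit. In practice you would use far more, for the sizing reasons below.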

Each Elasticsearch index has one or more shards, and Elastic recommends that you aim for up to 200 million documents per shard or an average shard size of 50 GB.

This documentation explains a little more about how to size your shards.
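As a rough illustration of that guideline, the sketch below estimates a shard count from a document count and a total data size, taking whichever constraint asks for more shards. The function name and defaults are assumptions for this example. It plugs in the 23 billion documents and 80 TB from the Rabobank story, and it assumes 1 TB = 1000 GB and that the 80 TB figure is primary data only, neither of which the story states.

```python
import math


def recommended_primary_shards(
    doc_count: int,
    total_size_gb: float,
    docs_per_shard: int = 200_000_000,  # guideline quoted above
    gb_per_shard: float = 50.0,         # guideline quoted above
) -> int:
    """Estimate primary shards so that no shard exceeds either guideline."""
    if doc_count < 0 or total_size_gb < 0:
        raise ValueError("doc_count and total_size_gb must be non-negative")
    by_docs = -(-doc_count // docs_per_shard)         # integer ceiling division
    by_size = math.ceil(total_size_gb / gb_per_shard)
    return max(1, by_docs, by_size)


if __name__ == "__main__":
    # Rabobank figures from the question; 80 TB taken as 80,000 GB (assumption).
    print(recommended_primary_shards(23_000_000_000, 80_000))  # 1600
```

Here the size guideline dominates: the document count alone would call for 115 shards, but 80 TB at 50 GB per shard calls for 1600. That many shards would normally be spread over multiple indices (for example, time-based ones) rather than placed in a single index.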

system · August 22, 2023, 11:02pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
