How to overcome the two-billion limitation on the number of Elasticsearch records?

As explained in the StackOverflow post quoted below, Elasticsearch has a limit of two billion documents.

Yes, there is a limit of 2 billion documents per shard, which is a hard Lucene limit.

There is a maximum number of documents you can have in a single Lucene index. As of LUCENE-5843, the limit is 2,147,483,519 (= Integer.MAX_VALUE - 128) documents.

You should consider scaling horizontally.

However, the official Elastic website features a success story about Rabobank, Enhancing the Online Banking Experience with Elasticsearch, which mentions a dataset of over 23 billion transactions.

Not only is Rabobank searching faster than ever, they’re searching through more data than ever. With over 23 billion transactions spanning 80TB of data, Rabobank sees upwards of 200 events per second — over 10 million per day. And each query can span thousands of accounts, with corporate customers having over 5,000 accounts that they can now query at once. And being able to do all of this without adding any extra operations to their costly mainframes has helped save them millions of euros per year. Today, all front-end applications use Elasticsearch for search or for aggregating payment and saving transactions.

We are investigating the feasibility of using Elasticsearch as a secondary data store, kept side by side with and synchronized to the main SQL database. The scenario is OLTP (online transaction processing).
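For illustration, here is a minimal sketch of the kind of synchronization we have in mind, assuming the official elasticsearch Python client and a hypothetical fetch_changed_rows() helper standing in for however changed rows are read from the SQL side (change table, CDC stream, etc.); the index name and field names are placeholders, not part of our actual system.

```python
from elasticsearch import Elasticsearch, helpers

# Hypothetical helper: yields rows changed in the SQL database since
# the last sync run. Not a real API; substitute your own change feed.
def fetch_changed_rows():
    yield {"id": 1, "account": "A-1001", "amount": 42.5}

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

# Mirror each changed SQL row into Elasticsearch as one document,
# reusing the SQL primary key as the document _id so re-syncing the
# same row overwrites instead of duplicating.
actions = (
    {
        "_index": "transactions",  # example index name
        "_id": row["id"],
        "_source": row,
    }
    for row in fetch_changed_rows()
)
helpers.bulk(es, actions)
```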

Our Question:

Considering the success story mentioned above, how can we overcome the two-billion limitation on the number of Elasticsearch records?

We would highly appreciate any hints or suggestions.

Hello,

The limit of 2 billion documents is per shard. If you have an index with 2 primary shards, for example, each shard can hold up to 2 billion documents, which in total is up to 4 billion documents.
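As an illustration, here is a minimal sketch of creating an index with several primary shards, assuming the official elasticsearch Python client (8.x keyword-argument style) and a local cluster; the index name and shard count are just examples.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

# Each primary shard is its own Lucene index, so the ~2 billion
# document limit applies per shard, not to the index as a whole.
es.indices.create(
    index="transactions",  # example index name
    settings={
        "number_of_shards": 4,    # fixed at creation time; plan ahead
        "number_of_replicas": 1,  # copies for resilience, not capacity
    },
)
```

Note that the number of primary shards cannot be changed after the index is created (short of reindexing or using the split API), so capacity planning has to happen up front.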

Each Elasticsearch index has one or more shards, and Elastic recommends aiming for at most 200 million documents per shard, or an average shard size of up to 50 GB.

This documentation explains in more detail how to size your shards.
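As a back-of-the-envelope check against those recommendations, a sketch like the following can estimate a shard count for a given dataset; the input figures below are placeholders taken from the Rabobank story above, not measurements.

```python
import math

# Placeholder inputs: substitute your own estimates.
total_docs = 23_000_000_000  # e.g. the 23 billion transactions above
total_size_gb = 80_000       # e.g. the 80 TB mentioned above

# Elastic's rough guidance: <= 200 million docs and <= 50 GB per shard.
shards_by_docs = math.ceil(total_docs / 200_000_000)
shards_by_size = math.ceil(total_size_gb / 50)

# Take whichever constraint requires more shards.
print(max(shards_by_docs, shards_by_size))  # -> 1600
```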

