Our system currently uses another NoSQL data store to manage its data.
It currently supports 180,000 inserts per second (we never update). Each document is 1KB and is stored in the system for 180 days, yielding roughly 2.6PB of raw data. The system continuously analyzes the data and allows users to browse the analysis results and to randomly access the stored documents (by id).
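For reference, the volume figure can be sanity-checked with a quick calculation. This is only a rough sketch: it assumes 1KB means 1024 bytes and ignores replication, indexing overhead and compression.

```python
# Back-of-the-envelope check of the raw data volume.
INSERTS_PER_SECOND = 180_000
DOC_SIZE_BYTES = 1024        # assumption: 1KB = 1024 bytes
RETENTION_DAYS = 180
SECONDS_PER_DAY = 86_400

# Number of documents held at any time once the retention window is full.
total_docs = INSERTS_PER_SECOND * SECONDS_PER_DAY * RETENTION_DAYS
total_bytes = total_docs * DOC_SIZE_BYTES

print(f"documents retained: {total_docs:,}")
print(f"raw size: {total_bytes / 1000**5:.2f} PB (decimal units)")
print(f"raw size: {total_bytes / 1024**5:.2f} PiB (binary units)")
```

Depending on whether decimal or binary units are used, the result lands slightly above or below the 2.6PB mentioned, so the figure is in the right ballpark.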
We have a new requirement to support advanced search capabilities over a subset of fields (20 out of 280).
We are considering using Elasticsearch in one of two ways:
- As a search engine only (with _source disabled): we index the 20 fields, and when the user performs a search we retrieve the ids from Elasticsearch and then use them to fetch the actual documents from a document store (e.g. Cassandra).
- As both search engine and document store (with _source enabled)
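To make the two options concrete, here is a minimal sketch of what the index mappings and the two-step read path might look like. The field names, the `search_then_fetch` helper and the in-memory stand-ins are all hypothetical and only for illustration; the mappings use the typeless syntax of recent Elasticsearch versions, and no real Elasticsearch or Cassandra client is involved.

```python
# Hypothetical field names; the real index would map the 20 searchable fields.
SEARCHABLE_FIELDS = {
    "customer_name": {"type": "keyword"},
    "created_at": {"type": "date"},
}

# Option 1: search engine only. The original JSON is not kept in _source,
# so a search is used only to obtain the ids of matching documents.
search_only_mapping = {
    "mappings": {
        "_source": {"enabled": False},
        "properties": SEARCHABLE_FIELDS,
    }
}

# Option 2: search engine and document store (_source is enabled by default).
search_and_store_mapping = {
    "mappings": {
        "_source": {"enabled": True},
        "properties": SEARCHABLE_FIELDS,
    }
}


def search_then_fetch(search_ids, fetch_document, query):
    """Option 1 read path: get matching ids, then load each document by id."""
    return [fetch_document(doc_id) for doc_id in search_ids(query)]


# In-memory stand-ins for the document store and the search index.
document_store = {
    "a1": {"customer_name": "acme", "payload": "full 280-field document"},
    "b2": {"customer_name": "globex", "payload": "full 280-field document"},
}


def fake_search_ids(query):
    # Stand-in for an Elasticsearch query that returns only document ids.
    return [doc_id for doc_id, doc in document_store.items()
            if doc["customer_name"] == query]


results = search_then_fetch(fake_search_ids, document_store.__getitem__, "acme")
print(results)
```

With the first option every search costs an extra round of lookups by id against the document store, while the second option keeps a second full copy of each document inside Elasticsearch.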
Which would be the better approach, considering the high insert rate?