I'm involved in a project where data is aggregated from a wide variety of data sources into a SQL Server instance. The aggregation is continuous (via a polling service) and involves large volumes of data (millions of records). A set of Elasticsearch indices is populated from the SQL Server data, with each index containing a graph of data drawn from several SQL tables. Elasticsearch is then used to provide fast searching of records (speed is of the essence here).
The mechanism for populating and maintaining the Elasticsearch indices is under review, which is my reason for posting:
- What options are there for keeping the Elasticsearch indices in sync with SQL Server in this scenario?
- Is there anything considered as best-practice?
I realise this is a somewhat open question, but I'm looking for suggestions or opinions on what would be worth investigating. I've read some other posts on the subject, but they suggest using "rivers", which I think have now been deprecated.
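
To make the scenario concrete, here is a minimal sketch of one kind of incremental sync that could be compared against other options. It is not the current implementation. It assumes each aggregated row carries a monotonically increasing `version` value (for example from a SQL Server `rowversion` column or change tracking), that the joined document for a record is already assembled in `doc`, and that the index is called `records`; all of these names are hypothetical. It only builds the newline-delimited body that the Elasticsearch `_bulk` endpoint expects and does not talk to either server.

```python
import json


def changed_rows(rows, last_version):
    """Return the rows whose version is newer than the stored watermark."""
    return [r for r in rows if r["version"] > last_version]


def to_bulk_body(index_name, rows):
    """Build an NDJSON body for the _bulk endpoint.

    Each document is an action line followed by a source line,
    and the whole body must end with a newline.
    """
    lines = []
    for row in rows:
        action = {"index": {"_index": index_name, "_id": str(row["id"])}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(row["doc"]))
    return ("\n".join(lines) + "\n") if lines else ""


# Hypothetical rows, standing in for the result of a join across several tables.
rows = [
    {"id": 1, "version": 10, "doc": {"name": "alpha"}},
    {"id": 2, "version": 12, "doc": {"name": "beta"}},
]

last_version = 10  # watermark saved after the previous poll (assumed)
pending = changed_rows(rows, last_version)
body = to_bulk_body("records", pending)

# Advance the watermark only past rows that were actually picked up.
new_watermark = max((r["version"] for r in pending), default=last_version)
```

In a setup like this the poller would send `body` to `_bulk` and persist `new_watermark` only after the request succeeds, so a failed batch is retried on the next poll. Deletes and changes to child tables that do not bump the parent row's version would need separate handling.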