Data stream = real time search?

I don't understand what feature Data Stream (Data streams | Elasticsearch Reference [master] | Elastic) is providing.

On first reading, a data stream looks like a dynamic index alias.

Is it providing in-memory segments for real-time indexing/search?

It's a formalisation of a strategy for dealing with endless firehoses of information.
New data is likely to be more interesting, so it is held on fast servers that build indices on disk and serve queries. Older indices are moved off onto less beefy servers that just serve queries. Even older data is held in backup storage and queried less frequently.

All these policies govern data ageing and the automatic movement between different classes of data store.
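To make that a bit more concrete, here is a rough sketch of what such a setup could look like, written as Python dicts that mirror the JSON request bodies for an ILM policy and a data-stream index template. The policy name, template name, index pattern and all age thresholds are made up for illustration, and `phase_for_age` is only a toy helper of my own, not anything Elasticsearch provides. It also simplifies how ILM measures age (for rolled-over indices, `min_age` counts from the rollover time).

```python
import json

# Hypothetical names and thresholds, chosen only for illustration.
POLICY_NAME = "firehose-policy"
INDEX_PATTERN = "logs-firehose-*"

# Body for: PUT _ilm/policy/firehose-policy
ilm_policy = {
    "policy": {
        "phases": {
            # New data: written to and queried on the fast nodes,
            # a fresh backing index is started every 7 days.
            "hot": {"actions": {"rollover": {"max_age": "7d"}}},
            # Older data: read-mostly, merged down to fewer segments.
            "warm": {
                "min_age": "30d",
                "actions": {"forcemerge": {"max_num_segments": 1}},
            },
            # Even older data: lowest recovery priority.
            "cold": {
                "min_age": "90d",
                "actions": {"set_priority": {"priority": 0}},
            },
            # Finally the backing index is dropped.
            "delete": {"min_age": "365d", "actions": {"delete": {}}},
        }
    }
}

# Body for: PUT _index_template/firehose-template
# The empty "data_stream" object is what makes matching names data streams.
index_template = {
    "index_patterns": [INDEX_PATTERN],
    "data_stream": {},
    "template": {"settings": {"index.lifecycle.name": POLICY_NAME}},
}


def phase_for_age(age_days: int) -> str:
    """Toy model: which phase of the policy above applies at a given age."""
    phases = ilm_policy["policy"]["phases"]
    current = "hot"
    for name in ("warm", "cold", "delete"):
        min_age_days = int(phases[name]["min_age"].rstrip("d"))
        if age_days >= min_age_days:
            current = name
    return current


if __name__ == "__main__":
    print(json.dumps(ilm_policy, indent=2))
    print(json.dumps(index_template, indent=2))
    for age in (0, 45, 120, 400):
        print(age, "days ->", phase_for_age(age))
```

Nothing here talks to a cluster; the dicts are just the bodies you would send, and the helper shows how a backing index drifts from phase to phase as it ages.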

No, unless you count the fact that giving a beefy server lots of RAM provides plenty of file system cache, which keeps the immutable segment files that Lucene creates in memory.

OK, got it! Thank you.

So this is more of a "time series data governance" feature than a "data stream" in the Kafka/Spark streaming sense.

I know Vespa uses in-memory storage to provide super-fast indexing; it would be great to have the same thing in Elasticsearch.

