What data is being queried when there is a continuous stream of input data?

Hi, sorry if this is a basic question, but here goes:
Let's say I have a continuous stream of data being pushed into an Elasticsearch index. When I send a query, what data does it apply to? If relevant data is added to the index while the query is being processed, will it be included in the final results?
I would love to see some docs on the subject, especially on how the low-level machinery behind this works.
Thanks!

Hey,

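The topic to read about is called refresh. By default there is a background process that runs once per second (this duration is called the refresh interval) and ensures that data indexed up to that point is made available for search. So a query only sees documents that had already been made searchable by a refresh when it started; documents indexed while it is running will show up in later searches.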

See https://www.elastic.co/guide/en/elasticsearch/reference/6.6/docs-refresh.html (you might also be interested in how Lucene stores its data, and how that data is made available for search)
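
To make the refresh idea concrete, here is a rough toy model in Python. It is only a sketch of the concept, not how Elasticsearch or Lucene is actually implemented: the `ToyShard` class, its methods, and the sample documents are all made up for illustration. The point it tries to show is that indexed documents sit in a buffer until a refresh publishes them, and that a search works against whatever snapshot existed when it started.

```python
class ToyShard:
    """Toy model of near-real-time search (hypothetical, not Elasticsearch code)."""

    def __init__(self):
        self._buffer = []       # documents indexed but not yet searchable
        self._searchable = ()   # immutable snapshot that searches can see

    def index(self, doc):
        # Newly indexed documents only land in the buffer.
        self._buffer.append(doc)

    def refresh(self):
        # Publish everything buffered so far as a new immutable snapshot.
        self._searchable = self._searchable + tuple(self._buffer)
        self._buffer.clear()

    def open_searcher(self):
        # A search holds on to the snapshot that exists when it starts.
        return self._searchable


def search(snapshot, term):
    """Return the documents in the snapshot that contain the term."""
    return [doc for doc in snapshot if term in doc]


shard = ToyShard()

# Indexed, but no refresh has happened yet, so a search finds nothing.
shard.index("error: disk full")
before_refresh = search(shard.open_searcher(), "error")

# After a refresh the document is searchable.
shard.refresh()
snapshot = shard.open_searcher()   # a query starts here

# More data arrives (and is even refreshed) while that query is "running".
shard.index("error: timeout")
shard.refresh()
during_query = search(snapshot, "error")

# A query started after the second refresh sees both documents.
after_refresh = search(shard.open_searcher(), "error")

print(before_refresh)
print(during_query)
print(after_refresh)
```

In this toy model the query that started before the second document arrived never sees it, while the next query does. In a real cluster each shard takes its own point-in-time view, so treat this as an intuition aid only.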

A couple of optional links:

https://www.elastic.co/guide/en/elasticsearch/reference/6.6/indices-flush.html
https://www.elastic.co/guide/en/elasticsearch/reference/6.6/index-modules-translog.html
https://www.elastic.co/guide/en/elasticsearch/reference/6.6/index-modules-merge.html
https://www.elastic.co/guide/en/elasticsearch/guide/current/inside-a-shard.html

--Alex


Thanks Alex! That is really helpful.
Cheers
