The "Realtime" section states that a GET will implicitly execute a refresh before returning the data (except when disabled with ?realtime=false).
The "Refresh" section further down on the same page states that I can use ?refresh=true to force a refresh before the GET is actually executed, but that I should use this with care for performance reasons.
Those two sections seem contradictory. What is the default: to refresh or not to refresh? Or maybe I misunderstood something?
Technically there are two kinds of refresh: INTERNAL and EXTERNAL. An INTERNAL refresh is much faster and cheaper than an EXTERNAL one. If realtime=true, the GET API might issue an INTERNAL refresh if needed; if refresh=true, the GET API might issue an EXTERNAL refresh.
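To make the three cases concrete, here is a minimal Python sketch that only builds the request URLs (it does not contact a cluster). The host `http://localhost:9200`, the index name `myindex`, the document id `1`, the `/{index}/_doc/{id}` path layout, and the helper `build_get_url` are all assumptions for illustration; only the `realtime` and `refresh` query parameters come from the discussion above.

```python
from urllib.parse import urlencode


def build_get_url(base_url, index, doc_id, realtime=None, refresh=None):
    """Build a document GET URL (hypothetical helper).

    Parameters left as None are omitted, so the server defaults apply.
    """
    params = {}
    if realtime is not None:
        params["realtime"] = "true" if realtime else "false"
    if refresh is not None:
        params["refresh"] = "true" if refresh else "false"
    # Path layout is an assumption; older versions use /{index}/{type}/{id}.
    url = f"{base_url}/{index}/_doc/{doc_id}"
    return f"{url}?{urlencode(params)}" if params else url


base = "http://localhost:9200"  # assumed local node

# Defaults: realtime GET, which might trigger an INTERNAL refresh if needed.
default_url = build_get_url(base, "myindex", "1")

# Opt out of realtime behaviour.
non_realtime_url = build_get_url(base, "myindex", "1", realtime=False)

# Ask for a refresh before the GET (the EXTERNAL, more expensive kind).
refresh_url = build_get_url(base, "myindex", "1", refresh=True)

for u in (default_url, non_realtime_url, refresh_url):
    print(u)
```

In most cases the first form, with no parameters, should be enough; the third is the one the documentation warns about.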
Thanks. It seems I can live with the defaults, although I am still a bit confused about the difference between an "internal" and an "external" refresh. If the internal refresh is so much faster, why not ALWAYS use that kind of refresh?
The EXTERNAL refresh is more expensive than the INTERNAL one because of caching and warming up. GET requests don't really need caching or warming, while search requests do. That's why we have two kinds of refresh.