The crux of my question is this: say I have my refresh_interval set to 60s and most of my indexing requests do not include ?refresh=true. Then one single request comes through which does have refresh set to true. Does that cause everything to be refreshed?
In other words, is there any point in having most indexing operations omit refresh if there are regularly requests that do set it? Say one arrives every second: does that effectively override my refresh_interval of 60s?
Thanks @nhat, so is mixing the two a bad idea? If several thousand requests arrive without refresh set and I then make one with it set (within my refresh_interval), that request will incur the refresh cost for all the other non-refreshed requests. The aggregate wait time could be smaller, but from the perspective of that one request it's a longer wait, correct?
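The scenario above can be illustrated with a toy model. This is only a sketch, not Elasticsearch code: `ToyShard` is a hypothetical class that assumes a single shard and assumes the cost of a refresh grows with the number of operations not yet refreshed. It shows that one request with refresh set makes every pending document searchable, not just its own.

```python
class ToyShard:
    """Simplified single-shard model (hypothetical, not Elasticsearch code)."""

    def __init__(self):
        self.pending = []      # indexed but not yet visible to search
        self.searchable = []   # visible to search

    def index(self, doc, refresh=False):
        """Index a document; return how many docs this call made searchable."""
        self.pending.append(doc)
        return self.refresh() if refresh else 0

    def refresh(self):
        """Make all pending docs searchable, as a refresh does for a shard."""
        count = len(self.pending)
        self.searchable.extend(self.pending)
        self.pending.clear()
        return count


shard = ToyShard()
for i in range(3000):
    shard.index({"id": i})                      # background load, no refresh
published = shard.index({"id": "user-doc"}, refresh=True)
print(published)  # 3001: the 3000 pending docs plus the user's own
```

In this model the single refreshing request pays for the 3000 documents indexed before it, which is the unpredictability described above.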
@nhat Sure, I have use cases where changes to a document need to be immediately reflected in search. For example, in our web app, users can take actions which add documents that need to be immediately reflected via search. Currently, because of that use case, all our requests set ?refresh=true. However, we also have many use cases where refresh would be optional: throughout the day, background processes bulk load documents into the same indexes, and those documents don't need to be reflected immediately. These processes are the "heavy lifters" in our system. So I'd hate to mix the two and impact a web app request because thousands of documents were loaded behind the scenes. It'd be unpredictable when, and how much, that might affect any given request requiring ?refresh=true.
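A minimal sketch of how the two kinds of request described above might differ. The helper `index_request`, the index name `pages`, and the document IDs are hypothetical; the sketch only builds the method and path of a single-document index request and sends nothing.

```python
from urllib.parse import urlencode


def index_request(index, doc_id, refresh=False):
    """Return (method, path) for a single-document index request."""
    path = f"/{index}/_doc/{doc_id}"
    if refresh:
        # Only user-facing writes ask for an immediate refresh.
        path += "?" + urlencode({"refresh": "true"})
    return "PUT", path


# Web app write: must be visible in search right away.
user_write = index_request("pages", "42", refresh=True)
# Background load: left to the index's refresh_interval.
background_write = index_request("pages", "43")

print(user_write)        # ('PUT', '/pages/_doc/42?refresh=true')
print(background_write)  # ('PUT', '/pages/_doc/43')
```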
@Thomas_Doman Setting refresh=true for the user requests makes sense in your use case. Do the newly indexed documents have to be visible in a search or can it be a realtime get instead?
@nhat Ah good point, but yes, the newly indexed documents have to be visible in a search. The pages are built based upon the same search both before and after the new documents are added. That is, after the new documents are added, the page is refreshed to reflect that change.