Hi,
The timeout setting is best-effort and relies on timer checks woven into the various “tight loops” in the search code.
One heavy loop that is currently missing timer checks is in the significant_text aggregation, where it looks up the background frequency of every term found in the matching docs. For this reason (and to limit memory usage) it is recommended to wrap the significant_text aggregation in a “sampler” aggregation, which limits the number of matching docs analysed and therefore the number of terms to check. Another way to reduce the number of background frequency checks is to increase “shard_min_doc_count”, e.g. to 3 or 4, so that only terms seen in at least that many docs on a shard are considered.
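As a rough sketch of the two suggestions above, here is how the request body might be put together in Python. The field name "content", the query text, the aggregation names "sample" and "keywords", the 10s timeout and the sample size of 200 are all made-up values for illustration, so adjust them to your own index. The snippet only builds the JSON body; sending it with whichever client you use is left out.

```python
import json


def build_significant_text_request(query_text, text_field,
                                   sample_size=200, shard_min_doc_count=3):
    """Build a search body that samples the top matches on each shard
    before running significant_text on them.

    All default values here are illustrative assumptions, not recommendations.
    """
    return {
        "size": 0,            # only the aggregation results are wanted
        "timeout": "10s",     # best-effort, see the caveat above
        "query": {"match": {text_field: query_text}},
        "aggs": {
            "sample": {
                # Limits the number of top-scoring docs analysed per shard
                "sampler": {"shard_size": sample_size},
                "aggs": {
                    "keywords": {
                        "significant_text": {
                            "field": text_field,
                            # Skip terms seen in fewer docs than this on a shard,
                            # which cuts down the background frequency lookups
                            "shard_min_doc_count": shard_min_doc_count,
                        }
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    body = build_significant_text_request("bird flu", "content")
    print(json.dumps(body, indent=2))
```

The significant_text aggregation is nested under the sampler so that it only sees the sampled docs rather than every match.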