The elasticsearch-hadoop documentation says:
es.batch.write.refresh (default true)
Whether to invoke an index refresh or not after a bulk update has been completed. Note this is called only after the entire write (meaning multiple bulk updates) have been executed.
However, the official Elasticsearch documentation on refresh says:
The Index, Update, Delete, and Bulk APIs support setting refresh to control when changes made by this request are made visible to search.
This (set refresh=true of the bulk request) should ONLY be done after careful thought and verification that it does not lead to poor performance, both from an indexing and a search standpoint.
true creates less efficient indexes constructs (tiny segments) that must later be merged into more efficient index constructs (larger segments). Meaning that the cost of true is paid at index time to create the tiny segment, at search time to search the tiny segment, and at merge time to make the larger segments.
The two documents seem to conflict, which is confusing. Can anybody explain why elasticsearch-hadoop uses true as the default value?
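
For context, here is a minimal sketch of how the default could be overridden from a job, assuming a PySpark job that writes through the elasticsearch-hadoop connector. The helper name es_write_options, the node address, and the index name are hypothetical placeholders; only the es.* setting names come from the connector documentation.

```python
def es_write_options(nodes, resource, refresh_after_write=False):
    """Build an options dict for an elasticsearch-hadoop write (hypothetical helper)."""
    return {
        "es.nodes": nodes,        # placeholder address, adjust for your cluster
        "es.resource": resource,  # placeholder target index
        # Settings are passed as strings; the documented default is "true".
        "es.batch.write.refresh": "true" if refresh_after_write else "false",
    }


options = es_write_options("localhost:9200", "my-index")
print(options["es.batch.write.refresh"])

# With PySpark and the connector jar on the classpath (not run here):
# df.write.format("org.elasticsearch.spark.sql").options(**options).mode("append").save()
```

With the setting at "false", newly written documents should become searchable at the index's regular refresh interval rather than immediately after the job finishes.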