Merge/segment understanding

Thanks Binh.

I'm curious about this because we're benchmarking our bulk indexing, and
we've found the fastest bulk indexing strategy to be:

  • bulk index with 0 replicas and no refresh, letting ES do as little
    merging as possible
  • when indexing has finished, optimize the segments
  • add the replicas back
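
For anyone who wants to reproduce this, here is a rough sketch of the order of REST calls we mean. It only builds the requests and does not send anything. The index name "logs", the replica count restored at the end, the "1s" refresh interval and max_num_segments=1 are assumptions for illustration, not values from our benchmark. The _optimize endpoint is the one in the ES 1.x line; merge throttling/policy settings are left out.

```python
def bulk_load_plan(index, replicas_after=1, max_segments=1):
    """Return the REST calls for the bulk-load strategy, in order,
    as (method, path, body) tuples. Nothing is sent over the network."""
    settings_path = f"/{index}/_settings"
    return [
        # 1. Before loading: no replicas, periodic refresh disabled.
        ("PUT", settings_path,
         {"index": {"number_of_replicas": 0, "refresh_interval": "-1"}}),
        # 2. Bulk requests go here (repeated; payload omitted in this sketch).
        ("POST", f"/{index}/_bulk", None),
        # 3. After loading: merge down to a few segments (assumed: 1).
        ("POST", f"/{index}/_optimize?max_num_segments={max_segments}", None),
        # 4. Finally: restore refresh and let the replicas be built.
        ("PUT", settings_path,
         {"index": {"number_of_replicas": replicas_after,
                    "refresh_interval": "1s"}}),
    ]


# Hypothetical index name, only to show the resulting sequence.
for method, path, body in bulk_load_plan("logs"):
    print(method, path, body if body is not None else "")
```

Doing the optimize before adding replicas means the replicas copy the already-merged segments instead of repeating the merge work on every copy.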

Is there any reading material on the details/internals of Lucene? We have
the book Lucene in Action, but it's mainly about core concepts and usage.

On Friday, March 28, 2014 at 8:32:46 PM UTC+1, Binh Ly wrote:

The indexing buffer could also fill up which will flush to a segment. Also
the translog flush is not "exactly" deterministic, for example
"index.translog.interval" determines how often to check if the translog
needs to be flushed or not. Anyway, I wouldn't worry about it if I were
you. About the merge, I'd probably leave the defaults alone unless you are
absolutely sure changing them helps you. The more segments there are, the
more time it could take to do a merge.
