Frequent index writes in an Elasticsearch cluster

We have an Elasticsearch cluster for a log collection system with 10
nodes (16 cores / 32 GB memory each). Each machine runs one ES instance
with a 16 GB JVM heap. Half of the 10 nodes are primaries and the other
half are slaves. We keep the last 7 days of logs in ES (one index per
day), at 1 TB of index per day. Each index has 20 shards.
With this write-heavy system we face the following challenges:

  1. Large JVM heap: we use the CMS GC with a 16 GB JVM heap.
  2. Frequent index writes (2000 indexing requests per second per node) and
    few search requests (10~20 search requests per second).
  3. End users want to search the logs in ES as soon as possible (log
    transport flow: the user app writes the log to local disk -> the
    collector reads the log and sends it to the remote ES cluster -> the
    end user searches the logs from the ES front page).
  4. Searches time out when ES merges Lucene segments, and index writes
    slow down. While this happens the CPU load rises and our cluster can
    hardly serve searches or index writes. (This occurs 1~2 times every day.)
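
For a sense of scale, the figures above can be turned into rough per-shard and cluster-wide numbers. This is only a back-of-the-envelope sketch: it assumes "1TB" means 1024 GB, and it does not know whether the 1 TB or the 20 shards include replica copies, so treat the results as approximate.

```python
# Rough sizing from the numbers quoted in the post.
# Assumptions (not stated in the post): 1 TB = 1024 GB, and the
# 1 TB/day and 20 shards figures are taken at face value.
NODES = 10
DAILY_INDEX_GB = 1024
SHARDS_PER_INDEX = 20
RETENTION_DAYS = 7
INDEX_REQS_PER_NODE = 2000

gb_per_shard = DAILY_INDEX_GB / SHARDS_PER_INDEX    # size of one shard
total_gb = DAILY_INDEX_GB * RETENTION_DAYS          # data kept for 7 days
total_shards = SHARDS_PER_INDEX * RETENTION_DAYS    # shards across 7 indices
cluster_reqs = NODES * INDEX_REQS_PER_NODE          # indexing requests/second

if __name__ == "__main__":
    print("GB per shard:", gb_per_shard)
    print("GB retained:", total_gb)
    print("shards retained:", total_shards)
    print("indexing requests/s cluster-wide:", cluster_reqs)
```

Shards of roughly 50 GB each mean the largest segment merges move a lot of data, which fits the I/O and CPU spikes described in item 4.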

Can anybody give us some suggestions for our system, especially
regarding Lucene segment merges? Thank you very much!

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Merges can be very heavy IO wise. They can slow things down, which can
cause other index or search requests to build up, which can result in a
"thread explosion".

Advice:

  1. fastest disks possible - SSDs if you can
  2. Use fixed thread pools, rather than the default "cached", to avoid
    a slow process causing the generation of thousands of threads
  3. Configure merge throttling to slow down merges. This will prevent
    merges from swamping your I/O
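
To make points 2 and 3 concrete, here is a small sketch that only builds the settings; it does not talk to a cluster. The setting names (`indices.store.throttle.*`, `threadpool.<pool>.type/size/queue_size`) are the ones I recall from the 0.90/1.x-era documentation and may not exist in other versions, so check the docs for your release. The helper names and all the numbers (20mb, 16, 1000, 500) are placeholders for illustration, not recommendations.

```python
import json


def merge_throttle_body(max_bytes_per_sec="20mb"):
    """Request body for a cluster settings update that throttles merge I/O.

    Setting names are assumed from 0.90/1.x-era Elasticsearch.
    """
    return {
        "transient": {
            "indices.store.throttle.type": "merge",
            "indices.store.throttle.max_bytes_per_sec": max_bytes_per_sec,
        }
    }


def fixed_pool_lines(pool, size, queue_size):
    """elasticsearch.yml lines that make one thread pool fixed-size.

    A bounded queue makes the node reject work instead of spawning
    more and more threads when merges slow everything down.
    """
    prefix = "threadpool.%s." % pool
    return [
        prefix + "type: fixed",
        prefix + "size: %d" % size,
        prefix + "queue_size: %d" % queue_size,
    ]


if __name__ == "__main__":
    # Body you would send to the cluster settings endpoint.
    print(json.dumps(merge_throttle_body(), indent=2))
    # Lines you would add to elasticsearch.yml on each node.
    for pool, size, queue in (("index", 16, 1000), ("search", 16, 500)):
        print("\n".join(fixed_pool_lines(pool, size, queue)))
```

Start with a conservative throttle value and raise it while watching I/O wait; too low a limit lets segments pile up faster than they can be merged.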

clint
