Elastic pressure


(Weiwei Wang) #1

I started four Elasticsearch instances on 4 nodes (101, 102, 103, 104;
4 shards, 1 replica), and I wrote a webapp that transforms the results
for my product's needs (it runs under Tomcat and adds little latency
in my tests).

Our QA ran a pressure test yesterday using JMeter. The configuration:
2 pressure nodes (on 101 and 102), each starting 4096 threads that send
requests to my webapp (on 104). My webapp uses TransportClient (with
sniff=true; I only added 101 to it) to connect to ES.

The results show:

  1. 101 is too busy (100% CPU) to respond to any connection. I used
    elasticsearch-head to monitor the status of ES and found that if I
    connect to 102:9200, it shows 102, 103, and 104 forming a cluster
    with 101 lost (however, I checked and 101 is still running ES, but
    at 100% CPU load).

  2. If I connect es-head to 103:9200 or 104:9200, it shows the cluster
    has only two nodes, 103 and 104; 101 and 102 are lost.

  3. If I connect es-head to 101:9200, it shows "not connected".

  4. I killed the JMeter nodes and checked 101 with the top command. The
    result is very weird: the CPU load stayed above 99%, and only after
    more than ten minutes did it start to drop. After the CPU load had
    dropped, es-head still could not connect to 101, and 101 still had
    not rejoined the cluster.
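
To make the inconsistency in the four observations easier to see, here is a minimal sketch that records what each node reported through es-head and checks whether the views agree. It is plain Java with no Elasticsearch dependency. The class and method names (ClusterViewCheck, observedViews, viewsAgree, unreachable) are hypothetical and only for illustration, and the data is just the membership lists described above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Elasticsearch or es-head.
class ClusterViewCheck {

    // Membership as seen from each node in the report above.
    // A null value means es-head could not connect to that node.
    static Map<String, Set<String>> observedViews() {
        Map<String, Set<String>> views = new LinkedHashMap<>();
        views.put("101", null);                                          // not connected
        views.put("102", new TreeSet<>(Arrays.asList("102", "103", "104")));
        views.put("103", new TreeSet<>(Arrays.asList("103", "104")));
        views.put("104", new TreeSet<>(Arrays.asList("103", "104")));
        return views;
    }

    // True only if every node that answered reports the same member set.
    static boolean viewsAgree(Map<String, Set<String>> views) {
        Set<Set<String>> distinct = new LinkedHashSet<>();
        for (Set<String> members : views.values()) {
            if (members != null) {
                distinct.add(members);
            }
        }
        return distinct.size() <= 1;
    }

    // Nodes that did not answer at all.
    static Set<String> unreachable(Map<String, Set<String>> views) {
        Set<String> down = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : views.entrySet()) {
            if (e.getValue() == null) {
                down.add(e.getKey());
            }
        }
        return down;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> views = observedViews();
        System.out.println("views agree: " + viewsAgree(views));
        System.out.println("unreachable: " + unreachable(views));
    }
}
```

With the data above, the reachable nodes report two different member sets (102 sees three nodes, while 103 and 104 see only each other), so the check reports disagreement, and 101 is listed as unreachable.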


(system) #2