Hi,
We plan to use ES to store and search our logs.
About 800G of logs will be produced per day.
We are currently running a long-duration performance test.
We are using 2 machines, 46 cores and 32G of memory. The configuration is 2
shards and 1 replica, with the ES heap size set to 20G. Only the log
message field is analyzed (norms and term frequencies omitted); the other 5
fields are not analyzed. The field cache type is set to soft with a max
size of 10000, _source is set to true, and _all is set to false.
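To make the setup above concrete, here is a rough sketch of what the index creation body might look like. The field names and the "log" type name are made up for illustration, and the setting and mapping keys are written from memory of older ES releases, so treat them as assumptions and check them against the docs for your version.

```python
import json

# Hypothetical names for the 5 not-analyzed fields; the post does not name them.
NOT_ANALYZED_FIELDS = ["host", "level", "logger", "thread", "timestamp"]


def build_index_body():
    """Build a create-index body matching the configuration described above."""
    properties = {
        # Only the log message is analyzed; norms and term frequencies are omitted.
        "message": {
            "type": "string",
            "index": "analyzed",
            "omit_norms": True,
            "omit_term_freq_and_positions": True,  # assumed key from older ES versions
        }
    }
    for name in NOT_ANALYZED_FIELDS:
        properties[name] = {"type": "string", "index": "not_analyzed"}

    return {
        "settings": {
            "index.number_of_shards": 2,
            "index.number_of_replicas": 1,
            "index.cache.field.type": "soft",       # assumed key name
            "index.cache.field.max_size": 10000,    # assumed key name
        },
        "mappings": {
            "log": {  # hypothetical type name
                "_source": {"enabled": True},
                "_all": {"enabled": False},
                "properties": properties,
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_index_body(), indent=2, sort_keys=True))
```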
We use JMeter to simulate 600 threads sending write requests to the
ES cluster (one log per request), and at the same time we use 10 threads
to query. The facet queries run on all the fields.
So far, we have indexed about 0.35 billion documents, taking up
360G (720G including the replica).
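For a sense of scale, the numbers above can be turned into an average document size and a sustained indexing rate. This is only back-of-the-envelope arithmetic: it assumes 1G = 10^9 bytes, that the 360G figure is the primary copy only, and that the 800G per day is measured the same way as the 360G. The function names are mine.

```python
GB = 10**9  # assumption: decimal gigabytes
SECONDS_PER_DAY = 24 * 60 * 60


def avg_doc_bytes(index_bytes, doc_count):
    """Average size of one indexed document, in bytes."""
    return index_bytes / doc_count


def docs_per_second(daily_bytes, bytes_per_doc):
    """Sustained indexing rate needed to absorb daily_bytes in 24 hours."""
    return daily_bytes / bytes_per_doc / SECONDS_PER_DAY


if __name__ == "__main__":
    size = avg_doc_bytes(360 * GB, 350_000_000)
    rate = docs_per_second(800 * GB, size)
    print(f"average document size: {size:.0f} bytes")
    print(f"required indexing rate: {rate:.0f} docs/second")
```

Under these assumptions each document is roughly 1 KB, and 800G per day works out to about 9,000 documents per second on average, before accounting for peaks.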
During the test, we ran into the following problems:
1. The cluster easily turns yellow. In ES head, we can only
see the node itself; the other node can't be seen.
After changing discovery from multicast to unicast and restarting ES, the
problem was fixed, but after running for 3-4 hours it occurred again.
Our system needs to run 7x24, so this problem is really
unacceptable.
Is there any way to fix this other than restarting ES?
In addition, even when we restart ES to fix the problem, it takes
about 2 hours for the cluster to fully recover to green.
2. Sometimes GC takes more than 1 hour. During these periods, ES
does not respond to any request.
Is there any way to avoid such long GC pauses?
3. Search performance is much more important than indexing performance
for our system.
We plan to create a new index per month and use 5 ES servers (5 shards
+ 1 replica) in our production environment to handle 800G of data per day.
Is this configuration OK? Are there any suggestions for the index
design and configuration?
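To put the planned layout in numbers, here is a small sizing sketch. It assumes a 30-day month, that the on-disk index size is about the same as the 800G of daily data, and that shard copies are spread evenly over the servers; none of that has been measured, and the function name is just for illustration.

```python
def monthly_index_plan(daily_gb, days, primary_shards, replicas, servers):
    """Estimate storage for one monthly index under the stated assumptions."""
    primary_gb = daily_gb * days              # data held by the primary shards
    total_gb = primary_gb * (1 + replicas)    # primaries plus all replica copies
    return {
        "primary_gb": primary_gb,
        "total_gb": total_gb,
        "gb_per_primary_shard": primary_gb / primary_shards,
        "gb_per_server": total_gb / servers,
    }


if __name__ == "__main__":
    plan = monthly_index_plan(daily_gb=800, days=30,
                              primary_shards=5, replicas=1, servers=5)
    for key, value in plan.items():
        print(f"{key}: {value:,.0f}")
```

With these inputs, one monthly index holds about 24,000G in primaries (48,000G with the replica), which is about 4,800G per primary shard and 9,600G per server.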
These problems are driving me crazy. Can anyone give me a hand?
Thank you very much!