I'm French and very new to Elasticsearch.
The Elasticsearch version, imposed by the security team, is 7.10.2.
I created a cluster with dedicated nodes, like this:
2 master nodes
1 master only eligible node
1 coordinating-only node
6 data nodes
2 data WARM nodes
2 data COLD nodes
6 ingest nodes
Once a day, in production, I have to ingest 130 GB of logs from a single application in one go (the application is load balanced across 32 servers).
In reality these are the logs of the day before: I retrieve the log archive the day after.
I ingest the log xxxxxx.log-2023-04-16 into ELK on 17th April 2024.
For example:
the logs are created on 16th April
an archive is made during the night of 16th April
I have to ingest the logs of 16th April on 17th April, the day after
The first day was 14th April, and the indexing was very slow.
So I need to tune things correctly.
Cluster side:
Shards: 1 primary, 1 replica
refresh_interval: 30s
Filebeat side:
9 inputs of type log
output.elasticsearch hosts:
--> I have listed all the ingest nodes
loadbalance: true
--> is this good practice?
bulk_max_size: 8192
--> can I increase bulk_max_size?
worker: 8
--> can I increase it? Up to what value?
queue.mem.events:
--> how do I calculate a good value?
Could you help me configure my Filebeat correctly, please?
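For the queue.mem.events question, here is a rough sizing sketch in Python. It rests on two assumptions that are not from this thread: that `worker` counts workers per configured host (which is how the Filebeat documentation describes it), and a rule of thumb that the memory queue should hold about two full bulk requests per output worker. The function names and the `headroom` parameter are hypothetical, and the result is only a starting point to test against.

```python
def total_output_workers(worker: int, hosts: int) -> int:
    """Workers are started per configured host, so the total is worker * hosts."""
    return worker * hosts


def suggested_queue_events(worker: int, hosts: int, bulk_max_size: int,
                           headroom: int = 2) -> int:
    """Heuristic (assumption): keep `headroom` full batches queued per worker,
    so no worker sits idle waiting for the queue to refill."""
    return total_output_workers(worker, hosts) * bulk_max_size * headroom


# Values from the post: 8 workers, 6 ingest nodes as hosts, bulk_max_size 8192.
print(total_output_workers(8, 6))          # 48 concurrent bulk requests
print(suggested_queue_events(8, 6, 8192))  # 786432 events held in memory
```

With the values above the queue would hold several hundred thousand events in memory, so a smaller bulk_max_size or fewer workers may be the more realistic direction if the Filebeat host is short on RAM.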
I would start by increasing the number of primary shards to 4 or 5 and see if that helps.
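A minimal sketch of what that change could look like as a composable index template body, built in Python. The index pattern `app-logs-*`, the policy name `app-logs-policy` and the helper name are hypothetical placeholders. Note that the actual setting names are `index.number_of_shards` and `index.number_of_replicas`, and that a template only affects indices created after it is updated.

```python
import json
from typing import Optional


def build_index_template(index_pattern: str, primaries: int, replicas: int,
                         ilm_policy: Optional[str] = None) -> dict:
    """Build the body for PUT _index_template/<name> (available since 7.8)."""
    settings = {
        "index.number_of_shards": primaries,
        "index.number_of_replicas": replicas,
        "index.refresh_interval": "30s",  # value taken from the question
    }
    if ilm_policy is not None:
        # Attach the existing ILM policy (name is a placeholder here).
        settings["index.lifecycle.name"] = ilm_policy
    return {
        "index_patterns": [index_pattern],
        "template": {"settings": settings},
    }


body = build_index_template("app-logs-*", primaries=4, replicas=1,
                            ilm_policy="app-logs-policy")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into Kibana Dev Tools; nothing here talks to the cluster.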
Please note that this version is EOL and no longer supported; you should be looking to upgrade as a matter of urgency. It seems odd that the security team would impose an EOL version with known bugs and security issues.
Hi, thanks for the answer.
I already have an index template attached to an ILM policy.
I modified the index template by adding index.number.shards and index.number.replicas, then I created an index, but it failed.
I will retry on Monday, but maybe it will work this weekend.
ELK is up :+). Surprise for Monday.
I have already raised the 7.10.2 EOL issue, but I work in state administration and decisions take time.
OK, my big index was created with 4 primary shards.
It seems to be better when I look at the indexing time API, if my request was correct.
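One way to check this, sketched in Python: the index stats API (GET /<index>/_stats/indexing) returns cumulative `index_total` and `index_time_in_millis` counters, and dividing one by the other gives an average indexing cost per document. The index names and the numbers in the sample response below are made up for illustration, and the helper name is hypothetical.

```python
def avg_index_ms_per_doc(stats: dict, index: str) -> float:
    """Average indexing time per document on the primaries, in milliseconds."""
    indexing = stats["indices"][index]["primaries"]["indexing"]
    total = indexing["index_total"]
    if total == 0:
        return 0.0
    return indexing["index_time_in_millis"] / total


# Trimmed, made-up example of a stats response for two indices.
sample = {
    "indices": {
        "logs-1-shard": {"primaries": {"indexing": {
            "index_total": 1_000_000, "index_time_in_millis": 900_000}}},
        "logs-4-shards": {"primaries": {"indexing": {
            "index_total": 1_000_000, "index_time_in_millis": 250_000}}},
    }
}

for name in sample["indices"]:
    print(name, avg_index_ms_per_doc(sample, name), "ms/doc")
```

The time counter is summed over shards, so the wall-clock duration of the whole daily ingest is still the number that matters most when comparing the 1-shard and 4-shard runs.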
But I am worried about two things with 4 shards:
Won't the disk volume increase?
I would like to use the shrink API on the WARM nodes to reduce the volume, but a copy of the old index is created first, so I will have 2 BIG indices, which requires twice the volume of the old index.
I will test.
Is the old index deleted automatically after shrinking?
I would like to shrink only one index and relocate it to a dedicated WARM node with a custom attribute, but for the moment I can't apply the index template.
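A sketch of the manual shrink sequence as a list of REST calls, built in Python without contacting the cluster. As far as I know, the manual shrink API leaves the source index in place (so the delete is a separate, explicit step, unlike the ILM shrink action), and it hard-links segments when the filesystem allows it, so the extra disk usage is not necessarily twice the index size. The index names, the node name `warm-node-1` and the attribute `box_type=warm` are hypothetical, and the sketch assumes the node chosen for the shrink is itself a WARM node carrying that attribute.

```python
import json


def shrink_plan(source: str, target: str, source_primaries: int,
                target_primaries: int, shrink_node: str,
                warm_attr: str, warm_value: str) -> list:
    """Return (method, path, body) tuples for a manual shrink."""
    # The target shard count must be a factor of the source shard count.
    if source_primaries % target_primaries != 0:
        raise ValueError("target primaries must divide source primaries")
    prepare = {"settings": {
        # Gather a copy of every shard on one node and block writes.
        "index.routing.allocation.require._name": shrink_node,
        "index.blocks.write": True,
    }}
    shrink = {"settings": {
        "index.number_of_shards": target_primaries,
        "index.number_of_replicas": 1,
        # Clear the settings copied from the source index.
        "index.routing.allocation.require._name": None,
        "index.blocks.write": None,
        # Keep the shrunken index on nodes with the custom attribute
        # (assumes shrink_node has this attribute too).
        f"index.routing.allocation.require.{warm_attr}": warm_value,
    }}
    return [
        ("PUT", f"/{source}/_settings", prepare),
        ("POST", f"/{source}/_shrink/{target}", shrink),
        ("DELETE", f"/{source}", None),  # only after the target is green
    ]


for method, path, body in shrink_plan("logs-big", "logs-big-shrunk", 4, 1,
                                      "warm-node-1", "box_type", "warm"):
    print(method, path)
    if body is not None:
        print(json.dumps(body, indent=2))
```

With 4 primaries the only valid shrink targets are 2 or 1, which is why the function rejects anything else.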