Huge logs - how to tune Filebeat

pepite · April 19, 2023, 7:04pm

Hi everybody,

I'm French and I'm very new to Elasticsearch.

Elasticsearch version imposed by the security team: 7.10.2

I created a cluster like this, with dedicated nodes:

2 master nodes
1 master only eligible node
1 coordinating only node
6 data nodes
2 data WARM nodes
2 data COLD nodes
6 ingest nodes

I have to ingest 130 GB of logs once per day from one production application (the application is load balanced across 32 servers).

In reality these are the logs of the day before: I retrieve an archive of the logs the day after.

For example, I ingest the log xxxxxx.log-2023-04-16 into ELK on 17 April 2023:

the logs are created on 16 April
an archive is made during the night of 16 April
I have to ingest the logs of 16 April on 17 April, the day after

The first day was 14 April, and indexing was very, very slow.

So I need to tune things correctly.

Cluster side:

Shards: 1 primary, 1 replica
refresh_interval: 30s

Filebeat side:

    9 inputs of type log
    output.elasticsearch hosts:
    --> I have put all the ingest nodes
    loadbalance: true
    --> is this good practice?
    bulk_max_size: 8192
    --> can I increase bulk_max_size?
    worker: 8
    --> can I increase it? Up to what value?
    queue.mem.events:
    --> how do I calculate a good value?

Could you help me configure Filebeat correctly, please?
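One way to reason about queue.mem.events is to relate it to the number of events the output can have in flight at once. The sketch below is only a rough rule of thumb, not an official formula. It relies on the fact that worker is counted per configured host (so 6 ingest hosts x 8 workers here); the headroom factor of 2 and the function names are assumptions to adjust after measuring.

```python
def in_flight_events(hosts: int, workers_per_host: int, bulk_max_size: int) -> int:
    """Upper bound on events held by all output workers in one bulk round."""
    return hosts * workers_per_host * bulk_max_size


def suggested_queue_events(hosts: int, workers_per_host: int,
                           bulk_max_size: int, headroom: int = 2) -> int:
    """Queue size that lets every worker fill a batch, with some headroom.

    The headroom factor is an assumption, not a documented constant.
    """
    return headroom * in_flight_events(hosts, workers_per_host, bulk_max_size)


if __name__ == "__main__":
    # Values taken from the post: 6 ingest nodes, worker: 8, bulk_max_size: 8192
    print(in_flight_events(6, 8, 8192))
    print(suggested_queue_events(6, 8, 8192))
```

If the queue is much smaller than the in-flight figure, some workers may never receive a full batch, so raising bulk_max_size or worker alone may not help.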

warkolm · April 20, 2023, 11:44pm

I would start by increasing the number of primary shards to 4 or 5 and see if that helps.
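As a rough illustration of that suggestion, the sketch below builds a request body for a composable index template (PUT _index_template/<name>) with 4 primary shards. It assumes composable templates are in use; the index pattern app-logs-* and the helper function are hypothetical. The setting names index.number_of_shards and index.number_of_replicas are the standard ones. The code only prints JSON and does not contact a cluster.

```python
import json


def build_index_template(pattern: str, primary_shards: int,
                         replicas: int = 1, refresh_interval: str = "30s") -> dict:
    """Return a composable index template body with explicit shard settings."""
    if primary_shards < 1:
        raise ValueError("primary_shards must be at least 1")
    return {
        "index_patterns": [pattern],  # hypothetical pattern, adapt to your indices
        "template": {
            "settings": {
                "index": {
                    "number_of_shards": primary_shards,
                    "number_of_replicas": replicas,
                    "refresh_interval": refresh_interval,
                }
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_index_template("app-logs-*", 4), indent=2))
```

The new shard count only applies to indices created after the template is updated; existing indices keep their original number of primary shards.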

Please note that version is EOL and no longer supported, you should be looking to upgrade as a matter of urgency. It seems odd that the security team would impose an EOL version with known bugs and security issues.

pepite · April 21, 2023, 7:01pm

Hi, thanks for the answer.
I already have an index template attached to an ILM policy.
I modified the index template by adding index.number.shards and index.number.replicas, then I created an index, but it failed.
I will retry on Monday, but maybe it will work this weekend.
ELK is up :+). A surprise for Monday.

I have already raised the fact that 7.10.2 is EOL, but I work in state administration and decisions take time.

Any other ideas?
Thanks.

pepite · April 24, 2023, 8:03pm

Hi @warkolm

OK, my big index was created with 4 primary shards.

It seems to be better when I look at the indexing time through the API, if my request was correct.
But I'm worried about 2 things with 4 shards:

the disk usage will increase, won't it?
I would like to use the shrink API on the WARM nodes to reduce the volume, but first a copy of the old index is created, so I will have 2 BIG indices. So it requires 2x the volume of the old index.

Is that correct?

warkolm · April 25, 2023, 10:31pm

Why's that? You aren't storing additional data are you?

Yes, that's a requirement for shrink.
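To make the shrink constraint concrete: the target index's primary shard count must be a factor of the source's, and space for a second copy should be budgeted while both indices exist. The sketch below lists the valid targets for a 4-shard index and gives a worst-case estimate. The 130 GB figure is only the daily log volume from the first post, used as a stand-in for the primary size of the index (an assumption), and on filesystems that support hard links the real extra usage may be lower.

```python
from typing import List


def valid_shrink_targets(source_primaries: int) -> List[int]:
    """Shard counts a shrink may target: proper factors of the source count."""
    return [n for n in range(1, source_primaries) if source_primaries % n == 0]


def worst_case_disk_gb(primary_size_gb: float, replicas: int) -> float:
    """Disk needed while source and shrunken index coexist, replicas included."""
    one_index = primary_size_gb * (1 + replicas)
    return 2 * one_index


if __name__ == "__main__":
    print(valid_shrink_targets(4))
    print(worst_case_disk_gb(130, replicas=1))
```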

pepite · April 28, 2023, 12:58pm

Hi,

No, only my indices.

I will test.
Is the old index deleted automatically when shrinking?

I would like to shrink only one index and reallocate it to a dedicated WARM node with a custom attribute, but for the moment I can't apply the index template.
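A sketch of the settings involved in that plan, under stated assumptions: the custom node attribute is hypothetically named box_type with value warm, the node name is a placeholder, and the node used for the shrink is itself a WARM node carrying that attribute. Before shrinking, the source index needs a copy of every shard on one node and must be blocked for writes; the shrunken index can then be pinned with index.routing.allocation.require.<attribute>. The functions only build request bodies for the index settings and _shrink APIs and do not contact a cluster.

```python
import json


def shrink_prepare_settings(node_name: str) -> dict:
    """Body for PUT <source>/_settings: gather shards on one node, block writes."""
    return {
        "settings": {
            "index.routing.allocation.require._name": node_name,
            "index.blocks.write": True,
        }
    }


def shrink_request_body(target_shards: int, attribute: str, value: str) -> dict:
    """Body for POST <source>/_shrink/<target>, pinning the target by attribute."""
    return {
        "settings": {
            "index.number_of_shards": target_shards,
            # Clear the temporary settings copied from the source index.
            "index.routing.allocation.require._name": None,
            "index.blocks.write": None,
            # Hypothetical custom attribute, e.g. node.attr.box_type: warm
            f"index.routing.allocation.require.{attribute}": value,
        }
    }


if __name__ == "__main__":
    print(json.dumps(shrink_prepare_settings("warm-node-1"), indent=2))
    print(json.dumps(shrink_request_body(1, "box_type", "warm"), indent=2))
```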

system · May 26, 2023, 2:59pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
