ELK machine capacity for 10,000 events per second

(Manjunath) #1

I have close to 1000 log files that together produce 10,000 events per second, out of which I have to extract 1,000 events per second, parse them, and send them to Elasticsearch.
Currently I have one bare-metal machine with 32 CPU cores and 47 GB of RAM.
What machine capacity, or how many machines, would ELK need to handle this?

Please advise.

(Mark Walkom) #2

Is your current machine not coping?
It should be able to cope with that load.

(Manjunath) #3

My groks are taking too long to process, which results in a huge latency in processing the files.
I have around 5 Logstash config files, with around 1000 files passed to them via /*.log.
Each config file has 4 to 6 grok filters.
I have spent a lot of time analysing this without any luck.
ES itself looks fine and is not under much load.

(Mark Walkom) #4

Then you need to add more LS workers or instances :slight_smile:

(Manjunath) #5

I tried that too. I currently have 28 workers running, and there is still latency.
It looks like the groks are eating up a lot of CPU.

(Mark Walkom) #6

Maybe we can help tune your grok patterns then?

(Manjunath) #7

That would be really great :smile:

Here are the groks from the first config:

             "message", "%{TIMESTAMP_ISO8601:log_timestamp}%{DATA}Site %{INT:id}%{DATA}accepted (?<skip>[0-9A-F]{24})(?<sequence_num_hex>[0-9A-F]{8})%{DATA}",
             "message", "%{TIMESTAMP_ISO8601:log_timestamp}%{DATA}Site %{INT}%{DATA}msgid: %{INT:msg_id}%{DATA}track_id: %{INT:track_id}%{DATA}name: %{USERNAME:name}, status: %{WORD:status}",
             "message", "%{TIMESTAMP_ISO8601:log_timestamp}%{DATA}Site %{INT:site}%{DATA}send (?<skip>[0-9A-F]{24})(?<sequence_num_hex>[0-9A-F]{8})%{DATA}",
             "message", "%{TIMESTAMP_ISO8601:log_timestamp}%{DATA}Site %{INT:site}%{DATA}msgid %{INT:msg_id}%{DATA}sdu: %{INT} %{INT} %{INT} %{INT:num}"

(Manjunath) #8

Do the groks look fine? They are eating up a lot of CPU.

(Mark Walkom) #9

I'd try creating custom patterns for the regex stuff you have in there, and then drop them using mutate (if that is what you are doing).
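
As a rough sketch of that idea, here is the first pattern from post #7 rewritten with custom patterns. The names `SKIP24` and `SEQHEX` are made up for illustration, and `pattern_definitions` assumes a grok filter version that supports it (otherwise the same definitions could live in a file under `patterns_dir`). The leading `^` assumes every line starts with the timestamp, which the original pattern does not guarantee, so check that against real log lines before using it.

```
filter {
  grok {
    # Hypothetical custom patterns replacing the inline (?<...>) regexes.
    pattern_definitions => {
      "SKIP24" => "[0-9A-F]{24}"
      "SEQHEX" => "[0-9A-F]{8}"
    }
    match => {
      # ^ anchor: assumes the timestamp is always at the start of the line.
      # %{SKIP24} has no field name, so with grok's default settings it is
      # matched but not stored, and there is no "skip" field to remove later.
      # The trailing %{DATA} is left out: it is non-greedy and captures nothing.
      "message" => "^%{TIMESTAMP_ISO8601:log_timestamp}%{DATA}Site %{INT:id}%{DATA}accepted %{SKIP24}%{SEQHEX:sequence_num_hex}"
    }
  }
}
```

If the `skip` field is still being captured by name elsewhere, a `mutate { remove_field => ["skip"] }` after the grok would drop it. Whether any of this actually reduces CPU use on this workload would need to be measured.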
