CPU spikes to more than 150% on a running Logstash instance

Hello,

I have two Ubuntu instances in Amazon with the following configuration.

Memory - 62.9 GB
Hard Disk - 197G
Hard Disk Available - 184G
Hard Disk Type - SSD
CPU - 8 cores

The first instance has Logstash set up to read Apache logs. There are 10 such logs, each with 140,000,000 lines (14 crore).
When Logstash is started, the CPU spikes to more than 150%.
If I start two Logstash instances, one shows 150% and the other more than 200%.
I have also tried with 10 such instances (all toggling between 200-250%).

We have written a simple grok filter for parsing the logs and then pushing them to Elasticsearch (set up on the other instance).

Also, Graphite displays around 3,500 events processed per minute. Is that appropriate?
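(At that rate the backlog would never clear: 10 files × 140,000,000 lines is 1.4 billion lines in total, and 1,400,000,000 ÷ 3,500 per minute is roughly 400,000 minutes, or about 278 days, so 3,500 events/min seems far too slow for this volume.)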

Let me know how the performance can be improved. Thanks in advance for your help.

Please refer to the Logstash configuration file:

input {
  file {
    path => "/home/ubuntu/logs/*.txt"
    start_position => "beginning"
    type => "tomcat_accessLogs"
    # re-read from the start on every run; no sincedb state is kept
    sincedb_path => "/dev/null"
  }
}

filter {
  if [type] == "tomcat_accessLogs" {
    # meter throughput; the generated metric events are tagged "metric"
    metrics {
      meter => "events"
      add_tag => "metric"
    }
    # parse one access-log line into named fields
    grok {
      match => { "message" => "%{IP:ipaddress}%{SPACE}\[%{HTTPDATE:datetime}\]%{SPACE}%{WORD:method}%{SPACE}%{DATA:request}\s++%{NOTSPACE:protocol}%{SPACE}%{INT:statuscode}%{SPACE}%{INT:size}%{SPACE}(?:%{NOTSPACE:queryString}|\s)%{SPACE}%{USER:user_name}%{GREEDYDATA:user-agent}" }
    }
    # use the log line's own timestamp as the event timestamp
    date {
      locale => "en"
      match => [ "datetime", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
    mutate {
      remove_field => [ "message", "datetime" ]
    }
  }
}

output {
  elasticsearch {
    host => "x.x.x.x"
    protocol => "http"
    port => "9200"
    cluster => "elkcluster"
  }
  graphite {
    metrics => [ "events.rate_1m", "%{events.rate_1m}" ]
  }
}
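
One thing I was not sure about: the metrics filter emits separate events tagged "metric", so with the configuration above both outputs receive both the parsed log events and the metric events. Would splitting them with a conditional along these lines be the right approach? (Just a sketch, not yet tested on our setup.)

output {
  if "metric" in [tags] {
    # rate events generated by the metrics filter go only to Graphite
    graphite {
      metrics => [ "events.rate_1m", "%{events.rate_1m}" ]
    }
  } else {
    # parsed access-log events go only to Elasticsearch
    elasticsearch {
      host => "x.x.x.x"
      protocol => "http"
      port => "9200"
      cluster => "elkcluster"
    }
  }
}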

Regards,
Bhavani

One instance is more than enough for the volume you're doing.

Does Logstash calm down after it processes the initial 140 million lines? How many hits per minute is your site taking?

Also, what does the IO wait look like when you run top?
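
You can check that by watching the %wa value in top's CPU summary line (press 1 to see it per core), or with iostat -x 1 for per-device utilization. If it turns out to be CPU-bound in grok rather than IO-bound, one thing worth trying (a sketch, assuming a Logstash 1.4.x-era install, which your elasticsearch output options suggest, where filtering defaults to a single worker thread) is raising the filter worker count so grok can use more of your 8 cores:

bin/logstash agent -f logstash.conf -w 4

150-250% in top is only 1.5-2.5 of 8 cores busy, which is consistent with one saturated filter thread plus the input and output threads.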