Small Scale elasticsearch implementation

Erica · July 1, 2014, 2:40pm

I am currently using logstash and elasticsearch to parse just one log file
(it would not be uncommon for the file to be 1gb). A 15mb file is taking 2
minutes to parse out with this configuration I have posted below (I have
also tried using no filter, which takes approximately 1 minute to parse). I
have a lot more grok patterns and filtering that I want to implement as
well, which have caused the amount of time it takes to parse to jump to 25
minutes. Aside from changing the grok patterns and such, is there a way to
make things faster? Even for the more basic configuration, if you scale
that 2 minutes to a 1gb file, the amount of time it takes to parse is
extremely high. I am running both logstash and elasticsearch on a machine
with 4gb of RAM and AMD athlon II Dualcore m300 2.00GHz processor.

I am looking to keep things on a small scale implementation, as I only want
to read 1 log file every so often for troubleshooting purposes. Any help
with making logstash/elasticsearch quicker would be much appreciated.
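
To put rough numbers on the scaling concern, here is a back-of-the-envelope sketch in Python. It assumes parse time grows linearly with file size, that 1gb means 1024 MB, and that the 25-minute figure was measured on the same 15mb file; none of these is confirmed above, and the function name is made up for the example.

```python
def extrapolate_minutes(sample_mb, sample_minutes, target_mb):
    """Linearly scale a measured parse time to a larger file (assumption)."""
    return sample_minutes * (target_mb / sample_mb)


TARGET_MB = 1024  # assumption: 1 GB treated as 1024 MB

# 15 MB parsed in 2 minutes with the basic grok filter
basic = extrapolate_minutes(15, 2, TARGET_MB)

# 15 MB parsed in 25 minutes with the larger filter set (assumed same file)
full = extrapolate_minutes(15, 25, TARGET_MB)

print(f"basic filter: ~{basic:.0f} min")
print(f"full filter set: ~{full / 60:.1f} h")
```

Under those assumptions the basic configuration would need a little over two hours for a 1gb file, and the larger filter set more than a day.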

input {
  file {
    path => "C:\Users\Attendee\Desktop\Log Files\2013-10-10\nexus20-ANALYTICS_8080.log.2013-10-10"
    start_position => "beginning"

    # if the log line doesn't have a timestamp at the beginning, this will
    # merge the line with the previous line (which would have a timestamp)
    codec => multiline {
      pattern => "^%{TIMESTAMP_ISO8601},"
      negate => true
      what => previous
    }
  }
}

filter {
  grok {
    match => [
      # default
      "message", "%{TIMESTAMP_ISO8601:time} [(%{WORD:soapService},|,)m=(%{WORD:m},|,)x=(%{GREEDYDATA:x},|,)c=(%{DATA:c},|,)i=(%{GREEDYDATA:i}|)] %{WORD:loglevel}\s*%{DATA:service}\s*-\s*%{GREEDYDATA:description}"
    ]
  }
}

output {
  elasticsearch {
    host => localhost
    embedded => true
  }
}
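
The multiline settings above (pattern, negate => true, what => previous) can be pictured with a small Python sketch. This only illustrates the merging behaviour described in the config comment; it is not how Logstash implements the codec. The simplified timestamp regex, the function name merge_multiline, and the sample log lines are all invented for the example.

```python
import re

# Simplified stand-in for grok's TIMESTAMP_ISO8601 (assumption: lines start
# with "YYYY-MM-DD HH:MM:SS"); the real grok pattern accepts more variants.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}")


def merge_multiline(lines):
    """Append every line that lacks a leading timestamp to the previous event."""
    events = []
    for line in lines:
        if TIMESTAMP.match(line) or not events:
            # A timestamped line (or the very first line) starts a new event.
            events.append(line)
        else:
            # negate => true, what => previous: fold into the prior event.
            events[-1] += "\n" + line
    return events


# Made-up sample lines, only loosely shaped like the log described above.
sample = [
    "2013-10-10 12:00:00,123 [svc,m=a,x=b,c=c,i=d] ERROR Foo - failed",
    "java.lang.IllegalStateException: boom",
    "    at com.example.Foo.bar(Foo.java:42)",
    "2013-10-10 12:00:01,456 [svc,m=a,x=b,c=c,i=d] INFO Foo - recovered",
]

events = merge_multiline(sample)
print(len(events))
```

Here the two stack-trace lines are folded into the first event, so four input lines become two events, and grok then runs once per merged event rather than once per physical line.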

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/491b0a16-7159-4b20-a354-ab118d1766a1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

warkolm · July 1, 2014, 10:26pm

This doesn't appear to be an ES-specific issue, but I can see you've
cross-posted this to the LS list, so I'll reply there.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com

