CPU spike when indexing data into Elasticsearch

Hello everyone,
I am new to Elasticsearch and am evaluating it for storing and
retrieving my application logs, which are currently stored in a
PostgreSQL database.
For performance and disk space reasons, I plan to switch log storage from
PostgreSQL to Elasticsearch. I plan to store the logs of each application
in a separate index in ES.

But when I implemented this, I noticed large CPU spikes on my system
whenever it received a request to insert log data into ES.
I am using Linux and have allocated 2 GB of RAM to ES. I am using REST
calls to push log data into ES. About 5000 records (all in JSON format)
are inserted into one index for every request into the application.
The refresh interval is set to 30 seconds in the elasticsearch.yml file.

The number of shards is set to 1 and replicas to 0. There are about 20
indices, one per application, which might eventually grow to 50. Please
also advise whether I should use a separate index per application or a
single index storing the logs of all applications. I expect about 3k to
5k log messages per request in any application.
A single node is used for storing the logs of multiple applications, which
run on the same node under separate JVMs.
JDK: 1.7

ES_HEAP_SIZE=2g
I have also set mlockall to true in the elasticsearch.yml file.

I have not been able to identify the root cause of the CPU spikes.

Can anyone please suggest how to troubleshoot this further and determine
the root cause? What should the next steps be?

ES version: 1.4.2

Please let me know if you need any further information.
I appreciate your input and help!

Regards,
Sagar Shah

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/aa4b625b-4e5d-4f6a-b92f-9b9c23cb3e4f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

That would be the indexing process that happens when you send the logs, as
ES needs to process and turn these into an inverted index for searching.

Thanks for your input, Mark.

Can that be optimized? Does it run in real time? I configured
index.refresh_interval to 30 seconds. Is there any way I can optimize the
indexing process?
I have observed that whenever I process 3-4 events in my application,
causing about 5-6K records to be inserted into Elasticsearch, there is a
CPU spike.

I monitored the heap as well, and its usage (about 200 MB) is way below
the size configured in ES (2 GB).

I don't need any fancy cluster or availability features at this point.
My whole objective (from my previous post) is to store log information,
retrieve it through the API, and show it in my web administration
application without causing any performance problems or CPU spikes.

Please advise.
Thanks!

Regards,
Sagar Shah

You will never get around the spike entirely. Elasticsearch uses
resources at index time to ensure that searching is fast.

You may be able to reduce it a little by adjusting your bulk size (i.e.
reducing it) or by adding more CPUs.
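
To illustrate what a smaller bulk size could look like, here is a minimal Python sketch, assuming the records are sent through the bulk endpoint. It only builds the newline-delimited request bodies and makes no network calls. The index name "app1-logs", the type name "log", the record fields, and the batch size of 500 are all made-up values for illustration, not recommendations.

```python
import json
from typing import Dict, Iterable, Iterator, List


def chunked(records: List[Dict], size: int) -> Iterator[List[Dict]]:
    """Yield consecutive slices of `records`, each with at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(records), size):
        yield records[start:start + size]


def bulk_body(index: str, doc_type: str, records: Iterable[Dict]) -> str:
    """Build a bulk request body: one action line, then one source line, per record."""
    lines = []
    for record in records:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(record))
    # The bulk body must be terminated by a newline.
    return "\n".join(lines) + "\n"


# Hypothetical data: 5000 log records for one application.
records = [{"level": "INFO", "message": "event %d" % i} for i in range(5000)]

# Ten smaller requests of 500 records each instead of one request of 5000.
bodies = [bulk_body("app1-logs", "log", chunk) for chunk in chunked(records, 500)]
print(len(bodies))  # 10
```

Each string in `bodies` would be sent as its own bulk request. Trying a few batch sizes and watching CPU usage is probably the simplest way to find out whether a smaller size smooths the spike on a given machine.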

Thanks for sharing your input, Mark, but I see the CPU spiking up to 80%
at times.
Is that expected or desirable for a few thousand records?

Regards,
Sagar Shah

You cannot get around this; it is expected and it is desirable.
