What is Elasticsearch bounded by? Is it CPU, memory, etc.?

rmadd · December 2, 2014, 12:06pm

I am running Elasticsearch on my personal box.

Memory: 6GB
Processor: Intel® Core™ i3-3120M CPU @ 2.50GHz × 4
OS: Ubuntu 12.04 - 64-bit

ElasticSearch Settings: Only running locally
Version : 1.2.2
ES_MIN_MEM=3g
ES_MAX_MEM=3g
threadpool.bulk.queue_size: 3000
indices.fielddata.cache.size: 25%
http.compression: true
bootstrap.mlockall: true
script.disable_dynamic: true
cluster.name: elasticsearch
index size: 252MB

Scenario: I am trying to test the performance of my bulk queries/aggregations. The test case sends asynchronous HTTP requests to node.js, which in turn calls Elasticsearch. The tests run from a Java method, starting with 50 requests at a time. In node.js, each request is split into two bulk queries that run in parallel (async.parallel). I am using the node-elasticsearch API (which uses the Elasticsearch 1.3 API). The two bulk queries contain 13 and 10 queries respectively, and both are sent asynchronously to Elasticsearch from node.js. When Elasticsearch returns, the query results are combined and sent back to the test case.
ElasticSearch Mapping,
ElasticSearch Sample Record,
ElasticSearch First Bulk Query,
ElasticSearch Second Bulk Query

The two bulk queries run in parallel. I send 50 concurrent requests from a Java test case to node.js. In node.js, each request is split into the two bulk requests above, which run in parallel; their responses are combined and sent back to the test case. All 50 requests together take 30 seconds, and a single request takes 1.4 seconds.

Observations: All CPU cores are at 100% utilization and memory is at around 90%. The response time for all 50 requests combined is 30 seconds. If I run each individual query from the bulk queries on its own, each returns in less than 100 milliseconds. Node.js takes negligible time to forward requests to Elasticsearch and to combine the responses. Even if I run the test case synchronously from Java, the response time does not change. It looks as if Elasticsearch is not processing the requests in parallel. Is this because I am CPU or memory bound? One more observation: changing the Elasticsearch heap size anywhere between 1 GB and 3 GB does not change the response time.
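
A rough back-of-envelope check of the figures above, as a sketch only. It assumes (this is not measured anywhere in the post) that every sub-query costs about 100 ms of CPU time on one core, which is the upper bound quoted above, and that the four logical cores are shared perfectly. The function and parameter names are made up for illustration.

```python
def estimated_wall_seconds(requests, queries_per_request,
                           cpu_seconds_per_query, cores):
    """Estimate wall-clock time if the workload is purely CPU bound."""
    total_cpu_seconds = requests * queries_per_request * cpu_seconds_per_query
    return total_cpu_seconds / cores


# Figures from the post: 50 requests, 13 + 10 queries each, 4 logical cores.
# ASSUMPTION: ~0.1 s of CPU per query (the post only says "less than 100 ms").
estimate = estimated_wall_seconds(
    requests=50,
    queries_per_request=13 + 10,
    cpu_seconds_per_query=0.1,
    cores=4,
)
print(f"estimated wall time: {estimate:.2f} s")
```

Under those assumptions the estimate lands just under 29 seconds, close to the observed 30. That would be consistent with the box simply running out of CPU rather than Elasticsearch failing to run the queries in parallel. If the real per-query cost is well below 100 ms, the estimate shrinks accordingly and this explanation becomes weaker.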

I am also pasting the output of the top command:

top - 18:04:12 up 4:29, 5 users, load average: 5.93, 5.16, 4.15
Tasks: 224 total, 3 running, 221 sleeping, 0 stopped, 0 zombie
Cpu(s): 98.2%us, 1.0%sy, 0.0%ni, 0.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 5955796k total, 5801920k used, 153876k free, 1548k buffers
Swap: 6133756k total, 708336k used, 5425420k free, 460436k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
17410 root 20 0 7495m 3.3g 27m S 366 58.6 5:09.57 java
15356 rmadd 20 0 1015m 125m 3636 S 19 2.2 1:14.03 node
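
One way to read the `Cpu(s)` line above, as a small sketch: parse the percentages and compare user plus system time against I/O wait. The helper names and the 10% / 80% thresholds are arbitrary choices for illustration, not Elasticsearch guidance.

```python
import re


def parse_cpu_line(line):
    """Turn a top 'Cpu(s):' line into a dict such as {'us': 98.2, 'wa': 0.0}."""
    return {name: float(value)
            for value, name in re.findall(r"([\d.]+)%(\w+)", line)}


def likely_bottleneck(cpu):
    """Classify a parsed Cpu(s) line as 'io', 'cpu', or 'unclear'."""
    # ASSUMPTION: both thresholds are illustrative only.
    if cpu.get("wa", 0.0) > 10.0:
        return "io"
    if cpu.get("us", 0.0) + cpu.get("sy", 0.0) > 80.0:
        return "cpu"
    return "unclear"


line = "Cpu(s): 98.2%us, 1.0%sy, 0.0%ni, 0.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st"
cpu = parse_cpu_line(line)
print(likely_bottleneck(cpu))
```

With 98.2% user time and 0.0% I/O wait, the sketch classifies this snapshot as CPU bound. It does not look at the `Swap:` line, which shows roughly 700 MB in use and may be worth watching separately.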

Questions: Is this expected because I am running Elasticsearch on my local machine and not in a cluster? Can I improve performance on my local machine? I will definitely set up a cluster, but I want to know how to improve performance in a way that scales. What is Elasticsearch bound by?

I was not able to find this in the forums, and I am sure the answer would help others. Thanks for your help.

jprante · December 4, 2014, 1:03pm

Why do you set the bulk indexing queue size to 3000?

Why do you limit the field data cache to 25%?

What documents are in the index?

What do your queries look like?

Jörg

rmadd · December 4, 2014, 5:32pm

I updated the post with more details and added the gist URLs. I was playing with the settings to see the general behavior, and I have since removed the field data cache and queue size settings. Please let me know if you want more details.
jstack thread dump
