I am running elastic search in my personal box.
Memory: 6GB
Processor: Intel® Core™ i3-3120M CPU @ 2.50GHz × 4
OS: Ubuntu 12.04 - 64-bit
ElasticSearch Settings: Only running locally
Version : 1.2.2
ES_MIN_MEM=3g
ES_MAX_MEM=3g
threadpool.bulk.queue_size: 3000
indices.fielddata.cache.size: 25%
http.compression: true
bootstrap.mlockall: true
script.disable_dynamic: true
cluster.name: elasticsearch
index size: 252MB
Scenario: I am trying to test the performance of my bulk queries/aggregations. The test case is to run asynchronous http requests to node.js which in turn will call elastic search. The tests are running from a Java method. Started with 50 requests at a time. Each request is divided and parallized in to two asynchronous(async.parallel) bulk queries in node.js. I am using node-elasticsearch api (uses elasticsearch 1.3 api). The two bulk queries contain 13 and 10 queries respectively.And the two are asynchronously sent to elastic search from node.js. When the Elastic Search returns, the query results are combined and sent back to the test case.
ElasticSearch Mapping,
ElasticSeach Sample Record,
ElasticSearch First Bulk Query,
ElasticSearch Second Bulk Query
The two bulk queries are run in parallel. I am sending 50 concurrent requests from a Java test case to NodeJs. In NodeJs, each request is divided in to the above two bulk requests and ran in parallel. The response from these two are combined and sent back to the test case. The time taken for all 50 is 30 secs. And single request is taking 1.4 seconds.
Observations: I see that all the cpu cores are utilized 100%. Memory is utilized around 90%. The response time for all 50 requests combined is 30 seconds. If I run just the single queries each alone, in the bulk queries, each are returning in less than 100 milli-seconds. Node.js is taking negligible time to forward requests to elastic search and combine responses from elastic search. Even if run the test case synchronously from java, the response time does not change. I may say that elastic search is not doing parallel processing. Is this because I am CPU or memory bound? One more observation: if I change heap size for elastic search from 1 - 3GB, the response time does not change.
Also I am pasting top command output:
top - 18:04:12 up 4:29, 5 users, load average: 5.93, 5.16, 4.15
Tasks: 224 total, 3 running, 221 sleeping, 0 stopped, 0 zombie
Cpu(s): 98.2%us, 1.0%sy, 0.0%ni, 0.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 5955796k total, 5801920k used, 153876k free, 1548k buffers
Swap: 6133756k total, 708336k used, 5425420k free, 460436k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
17410 root 20 0 7495m 3.3g 27m S 366 58.6 5:09.57 java
15356 rmadd 20 0 1015m 125m 3636 S 19 2.2 1:14.03 node
Questions: Is this expected, because I am running Elastic Search in my local machine and not in a cluster? Can I improve my performance in my local machine? I would definitely start a cluster. But I want to know, how to improve the performance scalably. What is it that the elastic search is bound to?
I am not able to find this in forums. And am sure this would help others. Thanks for your help.