The above query will always return a single unique result.
Now, the result:
throughput is 252.44 per second, with a 1.15% error rate (connection timeouts).
Even with the default Elasticsearch configuration, I think a throughput of 252.44 per second is very low. I need to get it to at least 1000 per second.
Please suggest how this can be done.
Thanks
I will look into this.
To troubleshoot the issue, I am thinking of starting with a single index with the default configuration and load testing that.
Then I would increase the number of shards, indices and nodes as required.
Is this the right approach to troubleshoot the issue?
I would recommend storing the two pieces of information you are filtering on in separate keyword-mapped fields and then using term queries instead of match phrase queries.
In order to give additional suggestions it would be good to know how many shards your data is distributed across and how much space this takes up on disk.
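As a rough illustration of the keyword-field and term-query suggestion, the request bodies could be built along these lines. This is only a sketch: the field names `user_id` and `account_code` and their values are hypothetical placeholders, since the thread does not show the real mapping, and the typeless `mappings.properties` layout assumes Elasticsearch 7 or later.

```python
import json


def build_mapping(field_a, field_b):
    """Index-creation body that maps both filter fields as exact-match keywords."""
    return {
        "mappings": {
            "properties": {
                field_a: {"type": "keyword"},
                field_b: {"type": "keyword"},
            }
        }
    }


def build_term_query(field_a, value_a, field_b, value_b):
    """Search body with two term filters; filter context skips scoring."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {field_a: value_a}},
                    {"term": {field_b: value_b}},
                ]
            }
        }
    }


if __name__ == "__main__":
    # Hypothetical field names and values, for illustration only.
    print(json.dumps(build_mapping("user_id", "account_code"), indent=2))
    print(json.dumps(build_term_query("user_id", "u-1", "account_code", "A7"), indent=2))
```

Term filters match the stored value exactly, so the values sent in the query need to be identical to what was indexed, including case.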
So you have 7 very small indices and a total of 14 shards? Why have you gone for having 7 indices instead of a single one?
As long as you do not have any mapping conflicts I would recommend that you reindex all your data into a single index and set the number of replicas so that all data nodes hold a copy of the data. Then send queries distributed across all data nodes with a local preference.
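To make the replica and local-preference advice concrete, here is a minimal sketch of how the settings body and search URLs might be derived. The node URLs and the index name `combined` are made-up placeholders; `number_of_replicas` and the `preference=_local` search parameter are standard Elasticsearch options, but please check them against the documentation for your version.

```python
import itertools


def replica_settings(data_node_count):
    """Settings body so every data node can hold a copy of each shard.

    One copy is the primary, so replicas = data nodes - 1.
    """
    if data_node_count < 1:
        raise ValueError("need at least one data node")
    return {"index": {"number_of_replicas": data_node_count - 1}}


def local_search_url(node_url, index_name):
    """Search URL asking the receiving node to prefer its local shard copies."""
    return f"{node_url.rstrip('/')}/{index_name}/_search?preference=_local"


def round_robin_urls(node_urls, index_name):
    """Endless iterator that spreads queries evenly across the data nodes."""
    return itertools.cycle(local_search_url(u, index_name) for u in node_urls)


if __name__ == "__main__":
    # Hypothetical node addresses, for illustration only.
    nodes = ["http://node1:9200", "http://node2:9200"]
    print(replica_settings(len(nodes)))
    urls = round_robin_urls(nodes, "combined")
    for _ in range(4):
        print(next(urls))
```

The settings body would be sent to the index's `_settings` endpoint, and the load test would then take its target URL from the round-robin iterator instead of always hitting the same node.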
All 7 indices serve different purposes at the application level, so I cannot merge them into one.
Please suggest if anything else can be done.
One more thing: when load testing with Apache JMeter at 7000 concurrent users, I am getting the error below.
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:156)
at org.apache.jmeter.protocol.http.sampler.HTTPHC4Impl$JMeterDefaultHttpClientConnectionOperator.connect(HTTPHC4Impl.java:326)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:374)
at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at org.apache.jmeter.protocol.http.sampler.HTTPHC4Impl.executeRequest(HTTPHC4Impl.java:850)
at org.apache.jmeter.protocol.http.sampler.HTTPHC4Impl.sample(HTTPHC4Impl.java:561)
at org.apache.jmeter.protocol.http.sampler.HTTPSamplerProxy.sample(HTTPSamplerProxy.java:67)
at org.apache.jmeter.protocol.http.sampler.HTTPSamplerBase.sample(HTTPSamplerBase.java:1282)
at org.apache.jmeter.protocol.http.sampler.HTTPSamplerBase.sample(HTTPSamplerBase.java:1271)
at org.apache.jmeter.threads.JMeterThread.doSampling(JMeterThread.java:627)
at org.apache.jmeter.threads.JMeterThread.executeSamplePackage(JMeterThread.java:551)
at org.apache.jmeter.threads.JMeterThread.processSampler(JMeterThread.java:490)
at org.apache.jmeter.threads.JMeterThread.run(JMeterThread.java:257)
at java.lang.Thread.run(Unknown Source)
Caused by: java.net.ConnectException: Connection timed out: connect
at java.net.DualStackPlainSocketImpl.connect0(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
at java.net.PlainSocketImpl.connect(Unknown Source)
at java.net.SocksSocketImpl.connect(Unknown Source)
at java.net.Socket.connect(Unknown Source)
at org.apache.http.conn.socket.PlainConnectionSocketFactory.connectSocket(PlainConnectionSocketFactory.java:75)
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142)
... 19 more
10.72.21.40 is a data node and is currently the first node; the second node is currently the master node.
Start with a low concurrency level and gradually increase it as long as query latency is acceptable. That will give you an idea of the level of concurrent queries your cluster can handle. If you cannot consolidate your indices, which would make querying far more efficient, you may need more CPU cores to be able to handle more load in parallel.
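The step-up approach described above could be scripted roughly as follows. This is a sketch rather than a replacement for JMeter: `run_query` stands in for whatever function sends one search request, and the concurrency levels and latency limit are arbitrary assumptions you would choose yourself.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def measure_latency(run_query, concurrency, total_requests):
    """Run total_requests calls on `concurrency` threads; return mean latency in seconds."""
    def timed_call(_):
        start = time.perf_counter()
        run_query()
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(timed_call, range(total_requests)))
    return sum(latencies) / len(latencies)


def find_max_concurrency(measure, levels, max_latency_s):
    """Walk the levels in order; return the last one whose latency stayed acceptable.

    Stops at the first level that exceeds the limit. Returns None if even
    the first level is too slow.
    """
    best = None
    for level in levels:
        if measure(level) > max_latency_s:
            break
        best = level
    return best


if __name__ == "__main__":
    def run_query():
        # Placeholder for a real search request against the cluster.
        time.sleep(0.01)

    result = find_max_concurrency(
        lambda level: measure_latency(run_query, level, total_requests=level * 5),
        levels=[10, 25, 50, 100],
        max_latency_s=0.1,
    )
    print("highest acceptable concurrency:", result)
```

Ramping up this way shows where latency starts to degrade, instead of jumping straight to thousands of users and only seeing timeouts.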
Regarding concurrent queries, the cluster can handle around 5500 concurrent users.
As for more CPU cores, it already has 4 cores. If utilization is not spiking, is there any need for more cores?
Currently CPU utilization is no more than 45%, which means the CPU is still not being used to its full potential, right?