Increasing ES search throughput

sumitborar · August 15, 2012, 7:04pm

I have a 15-node cluster with 14 data nodes and 1 master node. We are
indexing about 300M documents with 7 shards and 1 replica. Each node has
24GB of RAM. Right now I am using mmapfs for the index store, and ES is using
only 3GB of RAM (out of the 6GB assigned).

I am using simple text queries for search, and we get an average response
time of 200ms. However, our throughput is limited to about 100 QPS at 20
concurrent threads. Increasing the number of threads beyond that doesn't seem
to help, as CPU load reaches 100%.

Any suggestions on how to improve the throughput?
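
A rough sanity check on the figures above, as a minimal sketch rather than anything posted in the thread: in a closed-loop load test, where each thread waits for its response before sending the next query, throughput is bounded by concurrency divided by average latency (Little's law). The function name is illustrative.

```python
def closed_loop_qps(concurrent_threads: int, avg_latency_ms: float) -> float:
    """Upper bound on queries per second when every thread waits for its
    response before issuing the next query (Little's law)."""
    if avg_latency_ms <= 0:
        raise ValueError("avg_latency_ms must be positive")
    return concurrent_threads * 1000.0 / avg_latency_ms


# Figures from the post: 20 threads, 200 ms average response time.
print(closed_loop_qps(20, 200))  # 100.0
```

This matches the reported 100 QPS, which suggests that at 20 threads the test is bounded by latency. Once the CPU is saturated, extra threads would be expected to raise latency rather than throughput.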

--

Radu_Gheorghe1 · August 16, 2012, 4:21am

You can increase the number of replicas and see if that helps.

The other option I see is to use routing to direct the query to the shard
that contains your results. Here's a video about doing that and some more:

--

sumitborar · August 16, 2012, 5:24pm

Thanks for the reply. I tried adding more replicas; it helps a little
(from 100 to 125 QPS), but my CPU usage goes up really high.

I don't know how we can implement routing, since we are using an analyzer on
the string field and searching on it. I was under the impression that I can't
use routing for analyzed fields.

We did manage to reduce GC cycles by increasing the new object heap memory
size.

--

Radu_Gheorghe1 · August 22, 2012, 7:40pm

On Thursday, August 16, 2012 8:24:53 PM UTC+3, sumit wrote:

Thanks for the reply. I tried adding more replicas; it helps a little
(from 100 to 125 QPS), but my CPU usage goes up really high.

I would try with fewer shards and more replicas, if reindexing is an option
and the indexing load isn't very heavy. And I'd run this sort of performance
testing in a separate environment.

Also, I would optimize the index regularly.
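
The two suggestions above could be sketched as the REST requests below. This is an illustration under stated assumptions, not something posted in the thread: the index name `docs` is hypothetical, the helpers only build the requests and do not send them, and the `_optimize` endpoint is the 2012-era name (later releases renamed it to `_forcemerge`). Changing the number of primary shards is not shown, because it requires reindexing.

```python
import json


def replica_settings_request(index: str, replicas: int):
    """Build (method, path, body) for the update-settings call that
    changes the replica count of an existing index."""
    body = json.dumps({"index": {"number_of_replicas": replicas}})
    return "PUT", f"/{index}/_settings", body


def optimize_request(index: str, max_num_segments: int = 1):
    """Build (method, path) for merging an index down to a few segments.
    _optimize is the 2012-era endpoint name (assumption: old release)."""
    return "POST", f"/{index}/_optimize?max_num_segments={max_num_segments}"


def shard_copies(primaries: int, replicas: int) -> int:
    """Total shard copies (primaries plus replicas) the cluster hosts."""
    return primaries * (1 + replicas)


# "docs" is a hypothetical index name.
print(replica_settings_request("docs", 2))
print(optimize_request("docs"))
print(shard_copies(7, 1))  # 14
```

With 7 shards and 1 replica there are 14 shard copies, one per data node in the cluster described. Adding replicas beyond that places more than one copy on some nodes, which may be part of why CPU usage climbed.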

I don't know how we can implement routing, since we are using an analyzer on
the string field and searching on it. I was under the impression that I
can't use routing for analyzed fields.

As far as I understand, the routing value itself is not analyzed, which is
different from how you map your "data" fields. But I might be wrong, so I
suggest trying it with some sample data to see if you can make it work.
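
A minimal sketch of such a test, assuming the routing value is passed as the `routing` URL parameter at both index and search time. The index name, type name, document ID and routing value below are all hypothetical, and the helpers only build request paths; they do not contact a cluster.

```python
from urllib.parse import quote


def index_path(index: str, doc_type: str, doc_id: str, routing: str) -> str:
    """Path for indexing one document with an explicit routing value.
    The routing value decides which shard stores the document."""
    return (f"/{index}/{doc_type}/{quote(doc_id, safe='')}"
            f"?routing={quote(routing, safe='')}")


def search_path(index: str, routing: str) -> str:
    """Path for a search limited to the shard that the routing value
    maps to. The query body itself is left unchanged."""
    return f"/{index}/_search?routing={quote(routing, safe='')}"


# Hypothetical sample data: route every document of one customer together.
print(index_path("docs", "doc", "1", "customer-42"))
print(search_path("docs", "customer-42"))
```

Because the routing value travels in the URL rather than in a mapped field, the analyzer configured on the text field should not come into play. This only helps if each query can be tied to a single routing value, such as a customer or user ID.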

--
