Slow queries during user peaks


(Nerijus Oftas) #1

Hi,
Hi,
We have a project with a lot of website users. All site information is stored in Elasticsearch on AWS, on 8 i3.8xlarge.elasticsearch instances. We have one index, and in this index 7 types. The total document count is 80 million, which I don't think is a big number for Elasticsearch. But when we hit a user peak, Elasticsearch becomes very slow and we get timeouts from the AWS ELB, with CPU on the Elasticsearch instances climbing to 95%. Maybe you can give us suggestions on what we can tweak to get better performance. I checked the query count on each website page: we run 1-3 queries per page.

Thanks in advance for any help.
More info about our cluster:
3 master nodes (1 active, the other 2 for failover)
11 data nodes
Default configuration:
1 index / 5 shards
We've changed from 1 to 2 replicas


(Christian Dahlqvist) #2

How large is the index? How large are your documents? What type of queries are you using? How many queries per second are you serving? Which version of Elasticsearch are you using?


(Nerijus Oftas) #3

Thanks for the quick answer. We are using Elasticsearch version 5.6. I attached a file with a few example queries: http://prntscr.com/jyjff1. Most of our queries are similar to these.

The index size is 63.6 GB at the moment.
The documents are not big (about 0.81 KB per document). Each document contains 10-20 attributes, which are integers or keywords.
I actually do not know how many queries per second we serve. We cannot enable logging on production, because that would impact performance.
But as I said before, we run 1-3 queries per page load, and the AWS load balancer shows 120,000 requests per minute (2,000 per second).
I hope this info is enough to help.

I use this library: https://github.com/elastic/elasticsearch-php/tree/5.0


(Christian Dahlqvist) #4

If I understand this correctly, you have a single index in the cluster with 2 replicas (15 shards in total) across 11 data nodes. This means that each node only holds 1 or 2 shards. As your data set is reasonably small, have you tried increasing the replica count further to spread out the load better?

Could you also provide the output of the cluster stats API?

How frequently are you updating or indexing data?


(Nerijus Oftas) #5

Yes: a single index, 2 replicas, and 11 nodes. Every hour we update about 600,000 different documents, and we also index about 200,000-300,000 new documents per hour. We have not tried increasing the replica count further.

Here are the cluster stats:
{
  "_nodes": {"total":11,"successful":11,"failed":0},
  "cluster_name": "116743035446:predictor-production",
  "timestamp": 1529837959723,
  "status": "green",
  "indices": {
    "count": 3,
    "shards": {"total":27,"primaries":11,"replication":1.4545454545454546,"index":{"shards":{"min":2,"max":15,"avg":9.0},"primaries":{"min":1,"max":5,"avg":3.6666666666666665},"replication":{"min":1.0,"max":2.0,"avg":1.3333333333333333}}},
    "docs": {"count":82749828,"deleted":37320519},
    "store": {"size":"195.6gb","size_in_bytes":210099207551,"throttle_time":"0s","throttle_time_in_millis":0},
    "fielddata": {"memory_size":"0b","memory_size_in_bytes":0,"evictions":0},
    "query_cache": {"memory_size":"1.2gb","memory_size_in_bytes":1367228960,"total_count":466595045,"hit_count":157210692,"miss_count":309384353,"cache_size":472252,"cache_count":3991874,"evictions":3519622},
    "completion": {"size":"0b","size_in_bytes":0},
    "segments": {"count":481,"memory":"502.5mb","memory_in_bytes":526915166,"terms_memory":"380.7mb","terms_memory_in_bytes":399237280,"stored_fields_memory":"65.3mb","stored_fields_memory_in_bytes":68494840,"term_vectors_memory":"0b","term_vectors_memory_in_bytes":0,"norms_memory":"30kb","norms_memory_in_bytes":30784,"points_memory":"50.9mb","points_memory_in_bytes":53396362,"doc_values_memory":"5.4mb","doc_values_memory_in_bytes":5755900,"index_writer_memory":"52.5mb","index_writer_memory_in_bytes":55091103,"version_map_memory":"29.6kb","version_map_memory_in_bytes":30400,"fixed_bit_set":"0b","fixed_bit_set_memory_in_bytes":0,"max_unsafe_auto_id_timestamp":1529783290728,"file_sizes":{}}
  },
  "nodes": {
    "count": {"total":11,"data":8,"coordinating_only":0,"master":3,"ingest":8},
    "versions": ["5.5.2"],
    "os": {"available_processors":524,"allocated_processors":268,"names":[{"count":11}],"mem":{"total":"3.7tb","total_in_bytes":4149041672192,"free":"3.2tb","free_in_bytes":3577610948608,"used":"532.1gb","used_in_bytes":571430723584,"free_percent":86,"used_percent":14}},
    "process": {"cpu":{"percent":30},"open_file_descriptors":{"min":916,"max":2009,"avg":1705}},
    "jvm": {"max_uptime":"15.5h","max_uptime_in_millis":55920913,"mem":{"heap_used":"116.7gb","heap_used_in_bytes":125343354344,"heap_max":"256.1gb","heap_max_in_bytes":274993512448},"threads":4352},
    "fs": {"total":"109.7tb","total_in_bytes":120662917029888,"free":"109.5tb","free_in_bytes":120435399479296,"available":"109.5tb","available_in_bytes":120435214929920},
    "network_types": {"transport_types":{"netty4":11},"http_types":{"filter-jetty":11}}
  }
}


(Christian Dahlqvist) #6

Typically you increase query throughput in Elasticsearch by scaling out the number of replica shards. The trade-off is naturally that this requires more effort when indexing and updating data. I would recommend slowly increasing the number of replicas to see what effect it has unless you have a separate cluster to run benchmarks on.
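Since `number_of_replicas` is a dynamic index setting, it can be raised without a reindex via a settings update. A minimal sketch of stepping it up gradually (the index name `my-index` and the step values here are assumptions, not from the thread):

```python
import json


def replica_update_body(replicas: int) -> str:
    """JSON body for PUT /<index>/_settings that changes the replica count."""
    return json.dumps({"index": {"number_of_replicas": replicas}})


# Step from 2 to 3 replicas first, watch query latency and indexing lag
# under peak load, and only then continue to 4 if the cluster keeps up.
for step in (3, 4):
    print(f"PUT /my-index/_settings  {replica_update_body(step)}")
```

Increasing one replica at a time makes it easier to spot the point where the extra indexing cost starts to outweigh the query-throughput gain.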


(Nerijus Oftas) #7

Thanks for the suggestion. Also, if I split my index into smaller ones, will that help?


(Christian Dahlqvist) #8

It is difficult to know whether more, smaller shards will perform better or not, so the best way is probably to test or benchmark. It may make sense to align the number of primary shards with the number of data nodes you have.
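The number of primary shards cannot be changed on an existing index, so testing this means creating a new index with the desired shard count and copying the data over with the `_reindex` API. A sketch of the two request bodies, assuming 11 data nodes; the index names `products` and `products-v2` are placeholders, not from the thread:

```python
import json

DATA_NODES = 11  # align primaries with the data node count, per the suggestion

# Body for PUT /products-v2 -- the new index with more primary shards.
create_body = {
    "settings": {
        "number_of_shards": DATA_NODES,
        "number_of_replicas": 2,
    }
}

# Body for POST /_reindex -- copy all documents from the old index.
reindex_body = {
    "source": {"index": "products"},
    "dest": {"index": "products-v2"},
}

print(json.dumps(create_body))
print(json.dumps(reindex_body))
```

Once the reindex finishes and queries are pointed at the new index (for example via an alias swap), the old index can be deleted.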


(Nerijus Oftas) #9

We changed the instance type and added more nodes, which seems to have helped. We also got a suggestion from the AWS folks to reindex our data, so we moved the types to another index with more shards.


(system) #10

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.