How to make elastic querying faster

divyang · August 2, 2019, 7:11am

Hi, I need to query a lot of data from a given Elasticsearch index. I cannot set fielddata to true on the required index fields, as that would increase the size of cached memory. Currently the query takes approximately 10 minutes to run for a particular application using partitions, because querying everything in one query gives an OutOfMemory error. I want to reduce the time taken to query the indices. Any suggestions?

Christian_Dahlqvist · August 2, 2019, 7:30am

What is the query? What does the data look like? How many indices and shards are you querying? How much data do these hold?

divyang · August 2, 2019, 7:38am

Query is:
{
  "from": 0,
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        {
          "bool": {
            "must": [
              {
                "match_phrase": {
                  "app_id": {
                    "query": "APPID"
                  }
                }
              },
              {
                "range": {
                  "collector_tstamp": {
                    "from": "FROMDATE",
                    "to": "TODATE"
                  }
                }
              }
            ]
          }
        }
      ]
    }
  },
  "aggregations": {
    "page_urlpath": {
      "terms": {
        "field": "page_urlpath.keyword",
        "size": 2147483647,
        "include": {
          "partition": "PARTITION_NUMBER",
          "num_partitions": "TOTAL_PARTITIONS"
        }
      },
      "aggregations": {
        "visitors": {
          "cardinality": {
            "field": "domain_userid.keyword",
            "precision_threshold": 40000
          }
        },
        "visits": {
          "cardinality": {
            "field": "domain_sessionid.keyword",
            "precision_threshold": 40000
          }
        },
        "number_of_events": {
          "value_count": {
            "field": "_index"
          }
        }
      }
    }
  }
}

It queries the entire month at once. The month has a doc count of 6392551 records, with 3800 different buckets for the field page_urlpath. The cluster has 2 nodes, with 1013 shards per node.
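For readers following along, here is a minimal sketch of how the partitioned requests described above might be generated, one request body per partition. It only builds the bodies and sends nothing to a cluster. The helper name `build_partition_bodies`, the partition count of 4, and the `size` of 500 are illustrative assumptions, not values from the thread.

```python
import copy


def build_partition_bodies(base_body, agg_name, num_partitions):
    """Yield one request body per partition of a terms aggregation.

    Hypothetical helper: it fills in the "include" clause with integer
    partition numbers 0 .. num_partitions - 1.
    """
    for partition in range(num_partitions):
        # Deep copy so the caller's base body is never modified.
        body = copy.deepcopy(base_body)
        body["aggregations"][agg_name]["terms"]["include"] = {
            "partition": partition,
            "num_partitions": num_partitions,
        }
        yield body


# Trimmed-down version of the query in the post; size 500 is an assumed value.
base = {
    "size": 0,
    "aggregations": {
        "page_urlpath": {
            "terms": {"field": "page_urlpath.keyword", "size": 500}
        }
    },
}

bodies = list(build_partition_bodies(base, "page_urlpath", 4))
```

Each body in `bodies` could then be sent as a separate search request and the resulting buckets concatenated, since every term falls into exactly one partition.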

Christian_Dahlqvist · August 2, 2019, 7:57am

You should never set the size parameter to unnecessarily large values as it will use a lot of heap. See this old blog post for a discussion on this.
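As a rough illustration of this advice, the sketch below derives a bounded `size` from the number of distinct terms and the number of partitions, instead of using the maximum integer value. The function name `partition_size`, the 1.5 safety margin, and the choice of 20 partitions are assumptions made for the example; the 3800 figure is the bucket count reported in the thread.

```python
import math


def partition_size(total_terms, num_partitions, margin=1.5):
    """Estimate a terms-aggregation size for one partition.

    Terms are spread across partitions by hashing, so the split is not
    perfectly even. The margin (an assumed value) leaves headroom for that.
    """
    expected_per_partition = total_terms / num_partitions
    return math.ceil(expected_per_partition * margin)


# 3800 distinct page_urlpath values split over 20 hypothetical partitions.
size = partition_size(3800, 20)
```

With these inputs each request asks for a few hundred buckets rather than up to 2147483647, which should keep the per-request memory far smaller.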

divyang · August 2, 2019, 10:30am

Well, I changed that. It was definitely a good link to read, but performance still remains the same. Will creating a number of threads affect the heap?

Christian_Dahlqvist · August 3, 2019, 5:30am

It sounds like you have far, far too many shards given the amount of data you have. Please read this blog post and then try to dramatically reduce the number of shards in the cluster. I would expect having to query only a few shards to give much better performance.
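The thread does not say how the shard count should be reduced. One option, assumed here, is the Elasticsearch shrink API, which requires the target's primary shard count to be a factor of the source index's primary shard count. The sketch below only does that arithmetic; the helper names and the example shard counts are hypothetical.

```python
def shrink_targets(source_shards):
    """Return every valid primary-shard count for a shrink target.

    A shrink target must use a factor of the source shard count.
    """
    return [n for n in range(1, source_shards + 1) if source_shards % n == 0]


def pick_target(source_shards, max_shards):
    """Pick the largest valid target that does not exceed max_shards."""
    candidates = [n for n in shrink_targets(source_shards) if n <= max_shards]
    # 1 is a factor of every count, so candidates is non-empty when max_shards >= 1.
    return max(candidates)


# Hypothetical index with 12 primary shards, aiming for at most 5.
target = pick_target(12, 5)
```

For indices that are still being written to, lowering the shard count in the index template for future indices is another commonly used route. Which approach fits depends on details the thread does not give.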

system · August 31, 2019, 5:30am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
