Memory problems while querying

Hi,

I am running the query below on my cluster, and it is hitting the cluster hard: its status turns red. The cluster has 10 data nodes.

    {
      "size": 0,
      "query": {
        "bool": {
          "must_not": [],
          "must": [
            {
              "query_string": {
                "analyze_wildcard": "true",
                "query": "ID:/2[0-9]{11}/ AND NOT timestamp:[1535414400000 TO 1535435999599]"
              }
            },
            {
              "range": {
                "timestamp": {
                  "gte": 1535436000000,
                  "lte": 1535437799999,
                  "format": "epoch_millis"
                }
              }
            }
          ]
        }
      },
      "_source": {
        "excludes": []
      },
      "aggs": {
        "devices": {
          "terms": {
            "field": "ID",
            "order": {
              "_count": "desc"
            },
            "size": 26810
          },
          "aggs": {
            "timestamps": {
              "date_histogram": {
                "field": "timestamp",
                "interval": "15m",
                "time_zone": "Europe/Berlin",
                "min_doc_count": 1
              },
              "aggs": {
                "field1": {
                  "filters": {
                    "filters": {
                      "field1_meq-33_or_leq-63": {
                        "query_string": {
                          "query": "FIELD1:<-63 OR FIELD1:>-33",
                          "analyze_wildcard": "true"
                        }
                      },
                      "field1_total": {
                        "query_string": {
                          "query": "*",
                          "analyze_wildcard": "true"
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }

The index follows a daily naming pattern, which means that for the specified time range only one index is queried. It has 10 shards, each around 50GB, and no replicas.

Each data node has 64GB of RAM, with 32GB allocated to the heap.

Below you can also see a picture of the JVM heap usage on one of the data nodes during the period when the query is performed (around 11:50).

I have been reading multiple posts about these OOM problems to try to understand them. I believe the problem is the query itself, since it aggregates over many documents (this time range contains 5,235,160).

Can someone please help me troubleshoot this issue a bit further? Do you have any other ideas about why this may be happening, or a solution such as rewriting the query into a "better" one?

Thank you in advance!

The nested aggregations you are doing look strange to me, especially around the filters. Can you explain in words what the intent is there?

On the query portion at the top, there are simpler ways to go about this. It looks like you just want IDs that start with 2 within a given date range.

  1. Get rid of must_not and the _source excludes, since you aren't using them.
  2. Change must to filter, since you aren't scoring anything.
  3. Delete AND NOT timestamp..., since the range clause already makes it impossible for those documents to match.
  4. Change the remaining query_string to a simple prefix query (see the sketch after this list).
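
Putting those together, a minimal sketch of the trimmed request, assuming ID is a keyword (not analyzed) field so a prefix query applies, and with your aggs block attached unchanged below the query. Note the prefix query drops the exact 12-digit constraint of your original regex; if that matters, keep a regexp query on ID instead:

    {
      "size": 0,
      "query": {
        "bool": {
          "filter": [
            { "prefix": { "ID": "2" } },
            {
              "range": {
                "timestamp": {
                  "gte": 1535436000000,
                  "lte": 1535437799999,
                  "format": "epoch_millis"
                }
              }
            }
          ]
        }
      }
    }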

Next I would delete the field1 sub-agg, re-run the query, and see how your heap is behaving.
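
To keep an eye on the heap while you re-test (assuming you have access to the cat APIs), something like this gives a quick per-node view:

    GET _cat/nodes?v&h=name,heap.percent,heap.current,heap.max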

Finally, 26810 buckets is a lot, so consider breaking the agg into partitions. That will help reduce memory pressure.
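
Terms partitioning would look roughly like this (a sketch, not your exact request; the 20 partitions and the reduced size are arbitrary choices here). You run the same request once per partition value, 0 through 19, and combine the results on your side:

    {
      "size": 0,
      "aggs": {
        "devices": {
          "terms": {
            "field": "ID",
            "include": {
              "partition": 0,
              "num_partitions": 20
            },
            "size": 1500,
            "order": { "_count": "desc" }
          }
        }
      }
    }

Each request then only has to build roughly 26810 / 20 term buckets at a time instead of all of them at once.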

Good luck!

Hi Loren,

Thank you very much for your quick reply!

We followed your suggestions and now have a much lighter query that no longer impacts the cluster!

Once again, thank you very much for your help!

Great news! Glad to be of help.
