I need to run expensive queries on a couple of days' worth of indices, and the ES server gives me the error below:
Data too large, data for [reused_arrays] would be larger than limit of [2566520832/2.3gb]
Any general suggestions for overcoming this? Should I increase the number of shards, or use instances with more computing power? Thanks!
The cardinality aggregation is very expensive, and adding more nodes/resources would alleviate this.
A couple of other approaches to consider:
A. Reduce accuracy or
B. Break into multiple requests
For A, consider lowering the precision_threshold in the cardinality agg. The default value is 3,000, meaning each unique search term can track up to 3,000 session IDs; given the number of unique search terms there are likely to be, this is largely at the root of your memory problems.
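As a sketch, lowering precision_threshold might look like this (the field names `search_term.keyword` and `session_id` are assumptions based on your description; adjust to your mapping):

```json
{
  "size": 0,
  "aggs": {
    "search_terms": {
      "terms": { "field": "search_term.keyword" },
      "aggs": {
        "session_count": {
          "cardinality": {
            "field": "session_id",
            "precision_threshold": 100
          }
        }
      }
    }
  }
}
```

A lower precision_threshold trades counting accuracy for memory: each bucket keeps a much smaller sketch, at the cost of more error on high-cardinality terms.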
For option B we have a couple of ways of doing this. The first is to simply issue a query for the top 1,000 search_term.keyword values. This should give you an approximation of the search terms used in sessions. There will be false positives (search terms caused by outlier sessions repeating the same search) but no false negatives. Take this list of search terms and use it in a terms query in your existing example agg request.
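Roughly, that second request could look like the following, with the placeholder values in the terms query replaced by the top terms returned by the first request (field names are assumptions):

```json
{
  "size": 0,
  "query": {
    "terms": {
      "search_term.keyword": ["shoes", "laptop", "headphones"]
    }
  },
  "aggs": {
    "search_terms": {
      "terms": { "field": "search_term.keyword" },
      "aggs": {
        "session_count": {
          "cardinality": { "field": "session_id" }
        }
      }
    }
  }
}
```

Because the expensive per-bucket work now only runs over a bounded list of terms, the memory footprint of each request stays small.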
Another option in 5.4 is to run multiple requests like your existing one, but focus each request on a subset of the search terms in your index. This can be done using the partitioning feature on the include clause of your terms aggregation.
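For example, a partitioned request might look like this sketch (again, field names are assumptions). Setting num_partitions to 20 splits the terms into 20 roughly equal groups; you would issue 20 requests, varying "partition" from 0 to 19, and merge the results client-side:

```json
{
  "size": 0,
  "aggs": {
    "search_terms": {
      "terms": {
        "field": "search_term.keyword",
        "include": { "partition": 0, "num_partitions": 20 },
        "size": 10000
      },
      "aggs": {
        "session_count": {
          "cardinality": { "field": "session_id" }
        }
      }
    }
  }
}
```

Each request only builds buckets for one twentieth of the terms, so peak memory per request drops accordingly.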
This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.