Convert SQL GROUP BY To Aggregation with Pagination

Hi, I'm new to ElasticSearch. I'm trying to convert some queries I'm doing using Spring Data to ES usign High level client API.

What I'm doing is grouping data to show a filter menu (like in Google Sheet) as the one in the picture:

filter

The use then can choose one or more values to filter data.

To display the filtering menu I need to do a GROUP BY query and a COUNT() because, as you can see, I show total results in the footer.
Furthemore I need to page results because I can have also hundreds of groups.

In Java I solved all that with 2 simple queries:

 TypedQuery<String> query = entityManager.createQuery("SELECT CAST(" + property + " as string) FROM Contact c " + where + groupBy + orderBy, String.class);
TypedQuery<Long> queryCount = entityManager.createQuery("SELECT COUNT(*) FROM Contact c " + where + groupBy, Long.class);

In the query I do also have a WHERE with filters LIKE %%.

The first question is if I can reach the same result with ES.
I tried to use this code but I've several problems - maybe I'm still thinking in the SQL way -

 SearchSourceBuilder searchBuilder = new SearchSourceBuilder();
        for (Filter filter : filters) {
            searchBuilder.query(QueryBuilders.fuzzyQuery(filter.getProperty(), filter.getValues().stream().map(v -> v.toString()).collect(Collectors.joining(" "))));
        }
        searchBuilder.from(pageable.getPageNumber());
        searchBuilder.size(pageable.getPageSize());//---->HAS NOT EFFECT
        searchBuilder.timeout(new TimeValue(15, TimeUnit.SECONDS));
        searchBuilder.sort(new ScoreSortBuilder().order(SortOrder.DESC));//-->NOT CLEAR ON WHAT WORKS
        // we don't need the content
        searchBuilder.fetchSource(false);

        // @formatter:off
        TermsAggregationBuilder aggregation = AggregationBuilders
                .terms("by_property")
                .field(property + ".keyword");
        // @formatter:on
        /**
         * SORTING ON AGGREGATION
         */
        Iterator<Sort.Order> sortIterator = pageable.getSort().iterator();
        while (sortIterator.hasNext()) {
            Sort.Order order = sortIterator.next();
            if (StringUtils.equalsIgnoreCase(order.getProperty(), property)) {
                switch (order.getDirection()) {
                    case ASC:
                        aggregation.order(BucketOrder.key(true));
                        break;
                    case DESC:
                        aggregation.order(BucketOrder.key(false));
                        break;
                }
            }
        }
        searchBuilder.aggregation(aggregation);
        SearchRequest searchRequest = Requests.searchRequest(indexName).allowPartialSearchResults(true)
                .source(searchBuilder);
        SearchResponse searchResponse = elasticSearchClient.search(searchRequest, RequestOptions.DEFAULT);
        log.info("Response {}", searchResponse);

        // sr is here your SearchResponse object
        Terms groupedProperties = searchResponse.getAggregations().get("by_property");

        // For each entry
        for (Terms.Bucket entry : groupedProperties.getBuckets()) {
            entry.getKey(); // Term
            entry.getDocCount(); // Doc count
            log.info("Key {} Count {}", entry.getKey(), entry.getDocCount());
            results.add((String) entry.getKey());
            totalResults++;
        }

There are few things not clear:

  • I'm not able to get a paged result. I don't get why I've .size() both on SearchSourceBuilder and on TermsAggregationBuilder. It seems just the TermsAggregationBuilder.size() has effect in my case. Furthemore it seems I don't havefrom() on TermsAggregationBuilder, so I can't move on with next page. Please note my application is a Spring Boot REST app stateless so I can't use the ScrollApi here.
  • My fuzzy search doesn't work. In the widget the use can type part of the name or the entire name. I saw some discussion about how ES indexes fields and I've also the .keyword version of my field because it is a text field. However my filters don't work.
  • I didn't see a way to get total number of grouped values in order to say "I'm showing you 10 of 300 elements".

Finally I'd like to know if what I'm doing the if suggested way to group by a field on 800k rows.

Thanks

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.