Get number of unique results from multiple indices

NLSVTN · April 27, 2021, 2:59pm

I have 5 indices, all with the same schema; each one represents one dataset among many. Each is around 1 TB in size. I want to issue a query against all of the indices at the same time and then count the results after removing duplicate document ids (there will be many duplicates across the indices). I am interested only in the document ids, just to determine the number of unique results so that I can show it to the user. How could I do that efficiently? Is it possible at all with Elasticsearch? I can't just fetch all of the results for obvious reasons: a query can match every document in an index.

warkolm · April 28, 2021, 1:26am

Sounds like a terms agg might do what you want?

NLSVTN · May 3, 2021, 7:37pm

I found the following solution:

    {
      "aggs": {
        "variants_count": {
          "cardinality": { "field": "variantId" } 
        }
      }
    }

Using a cardinality aggregation gives me what I want. However, I have an issue: there are 4 identical paged searches (only the indices differ) that are constructed using MultiSearch, and I am not sure whether it is possible to do something like this in elasticsearch-dsl:

    ms.aggs.bucket('variants_count', 'cardinality', field='variantId')

where ms is of type MultiSearch, but run over all pages and indices.

What should the approach be? How could I achieve that?

It seems I am looking for something like merging the MultiSearch into a single query.
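
One possible way to get that "merged" behaviour: a single search request can target several indices at once, and a cardinality aggregation on that request counts distinct values across all of them, so the count itself needs neither MultiSearch nor paging. Below is a minimal sketch, assuming the official Python client and its `client.search(index=..., body=...)` call. The helper names `unique_count_body` and `count_unique` and the index names in the usage note are hypothetical; `variantId` and `variants_count` come from the query above. Keep in mind that cardinality is an approximate count, and `precision_threshold` (40000 is the maximum) trades memory for accuracy.

```python
from typing import Any, Dict, Iterable, Optional


def unique_count_body(query: Optional[Dict[str, Any]] = None,
                      field: str = "variantId",
                      precision_threshold: int = 40000) -> Dict[str, Any]:
    """Build a search body that returns only an approximate distinct count."""
    return {
        "size": 0,  # no hits are needed, only the aggregation result
        "query": query or {"match_all": {}},
        "aggs": {
            "variants_count": {
                "cardinality": {
                    "field": field,
                    # higher threshold = more accurate, more memory
                    "precision_threshold": precision_threshold,
                }
            }
        },
    }


def count_unique(client: Any,
                 indices: Iterable[str],
                 query: Optional[Dict[str, Any]] = None) -> int:
    """Run one request over all given indices and return the distinct count.

    `client` is assumed to expose search(index=..., body=...), as the
    official Elasticsearch Python client does.
    """
    response = client.search(index=",".join(indices),
                             body=unique_count_body(query))
    return response["aggregations"]["variants_count"]["value"]
```

Usage would look something like `count_unique(es, ["dataset-1", "dataset-2"], {"term": {"gene": "BRCA1"}})`, where `es` is an `Elasticsearch` client instance and the index names and query are placeholders. The paged MultiSearch can still be used to fetch the hits shown to the user; only the total is computed by this separate request.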

system · May 31, 2021, 7:37pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
