Count based on a field instead of the number of entries

(Guillaume Baudringhien) #1

Hi all,
Thanks in advance for helping me out with that.

My tricky question is how can I do a count on a specific field (let's same 'names') and not on the number of entries ? I'd like to count how many names do I have for a specific timestamp matching with my filters but I have multiple entries with the same name. So the count is actually showing me the number of entries.

For example, with the index below :

{'_id':'1'; 'names':'user1'; 'status':'accepted'}
{'_id':'2'; 'names':'user1'; 'status':'failed'}
{'_id':'3'; 'names':'user2'; 'status':'accepted'}
{'_id':'4'; 'names':'user2'; 'status':'failed'}
{'_id':'5'; 'names':'user1'; 'status':'accepted'}
{'_id':'6'; 'names':'user2'; 'status':'failed'}

A count where status=accepted will give me 3, but if I want the number of user accepted, I need to do a count on 'names' and I'll get 2 (this is a rubbish example by the way)

I hope I'm clear enough
Thanks
Regards
Guillaume

#2

Hmm, maybe something like this where cardinality aggregation (1) will include the number 2 you need?

POST my_index/_search
{
  "aggs": {
    "1": {
      "cardinality": {
        "field": "names.keyword"
      }
    }
  },
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        {
          "bool": {
            "should": [
              {
                "match": {
                  "status": "accepted"
                }
              }
            ],
            "minimum_should_match": 1
          }
        }
      ]
    }
  }
}

Response:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "1" : {
      "value" : 2
    }
  }
}
(Guillaume Baudringhien) #3

Hi @azasypkin,
Thanks a lot for your reply, it looks like exactly what I wanted
I added few 'match' more in your query to add different status in my result and it's still fine because they behave like a 'or' => (status=sts1 or status=sts2 or status=sts3) will be in my result.
My last question would be, what solution do I have if I want to apply another filter on another field like a and this time. Something like : "count the records where 'years':'2019' and status=accepted or status=tryagain, etc ..." ?
Any way ?
Thanks again

#4

If you're still learning Elasticsearch queries, I can show how you can figure that out by yourself using Kibana's Inspector :wink:

Best,
Oleg

(Guillaume Baudringhien) #5

:heart_eyes::heart_eyes::heart_eyes:
That's lovely !!
Didn't know this possibility to create such queries

This makes me think about another thing, is it possible to display those results let's say in a vertical bars graph using different queries ?

#6

You can do lots of stuff with bar chart and another visualizations, but it highly depends on your particular use case.

using different queries ?

Not sure what you mean exactly here, but I'd encourage you to just experiment with bucket aggregations, sub-buckets etc. And then if you have more concrete questions open another thread at Discuss so that others can engage as well.

The forum is also full of real world examples and answers if you ask the right questions :slight_smile:

Best,
Oleg

(Guillaume Baudringhien) #7

Alright,

I'll go for some doc then
Thanks a lot for your help

Reg
Guillaume