Elastic Query Bool Logic/Subquery Structure Help

Need help converting some SQL into elastic and stuck on some of the bool logic and probably isn't structured properly. Goal is to get a count of users that are active at a specified start date and are still active at the specified end date.

Sample data. Result should return a count of 2 as only A and B meet the criteria

DATE              FIELD         USERNAME
2020-10-01        START          A
2020-10-01        START          B
2020-10-01        START          C
2021-09-31        END            A
2021-09-31        END            B
2021-12-31        END            C
select count(*) from TABLE
where FIELD = 'START' and DATE = '2020-10-01'
and USERNAME IN (select USERNAME from TABLE where FIELD = 'END' and DATE = '2021-09-31'

The following is what I have so far and it's definitely wrong. I initially tried to mimic the subquery by nesting the second bool inside the first, but that had way too many rows. This is getting closer, but I'm still missing something.
Created with the intent of using Vega with Context on, so no query block.

{
  "size": 0,
  "aggs": {
    "countingusers": {
      "filter": {
        "bool": {
          "should": [
            {
              "bool": {
                "must": [
                  {
                    "range": {
                      "DATE": {
                        "gte": "2020-10-01",
                        "lte": "2020-10-31"
                      }
                    }
                  },
                  {
                    "match": {
                      "FIELD": "START"
                    }
                  }
                ]
              }
            },
            {
              "bool": {
                "must": [
                  {
                    "range": {
                      "DATE": {
                        "gte": "2021-09-01",
                        "lte": "2021-09-31"
                      }
                    }
                  },
                  {
                    "match": {
                      "FIELD": "END"
                    }
                  }
                ]
              }
            }
          ]
        }
      },
      "aggs": {
        "countusers": {
          "value_count": {
            "field": "USERNAME"
          }
        }
      }
    }
  }
}

I suggest you explore transforms in Elasticsearch that will help to compute first start date and last end date per user(your entity) and store summary in a new index that you can query

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.