Can't Filter Aggregation Results

ryes · October 21, 2019, 2:15pm

Hello, all!

I have an index with web transaction logs. It consists of requests and responses.
The documents look like this:

{ request_id: '1234', message: 'request', path: '/fare' }
{ request_id: '1234', message: 'response', status: 202 }
{ request_id: '5789', message: 'request', path: '/orders' }
{ request_id: '5789', message: 'response', status: 202 }
{ request_id: '4512', message: 'request', path: '/orders' }
{ request_id: '4512', message: 'response', status: 500 }

I need to count the number of failed (status != 202) '/orders' requests, but I can't figure out how to write the query.

I grouped the requests and responses by request_id, but then couldn't find a way to filter the aggregation results by the request's path and the response's status. I also looked at pipeline aggregations. There is a Bucket Selector Aggregation in the docs, but the problem is that it works only with numeric metrics. So I'm stuck.

Is it possible to achieve my goal at all? If so, can you point me in the right direction, please?
Or would it be better to reconsider the index design?

Thanks in advance!

Glen_Smith · October 21, 2019, 4:57pm

You can use a bool query: a must_not clause to exclude status=202, and a filter clause to match message=response and path=/orders.
Since you only want a count, you can hit the _count endpoint instead of _search.
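
For reference, the suggestion above could be written roughly as follows. This is only a sketch: it assumes the index is called logs (the name used in the query later in this thread) and that message and path are mapped as keyword fields (with default dynamic mappings, message.keyword and path.keyword would be needed instead). It can only match documents that carry message, path and status together.

```console
# Sketch of the bool query described above; index name and keyword mappings are assumptions
GET logs/_count
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "message": "response" } },
        { "term": { "path": "/orders" } }
      ],
      "must_not": [
        { "term": { "status": 202 } }
      ]
    }
  }
}
```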

ryes · October 23, 2019, 6:27am

Hello, Glen. Thank you for the response. I appreciate your help! If I understand your suggestion correctly, I can't apply it to my problem. Let me explain why.

My task is to count failed transactions for a specific path. One transaction produces two log entries: a request and a response. A request entry contains the :path field and a response entry has the :status field. Because these fields live in separate documents, a single filter section can't match message=response and path=/orders at the same time. I need to aggregate first. But when I aggregate requests and responses into buckets by request_id (one bucket == one transaction), I lose direct access to the :path and :status fields.

I recently tried* to achieve part of the desired result with the following steps:

1. Aggregate requests and responses into buckets by :request_id
2. Add a :path attribute to the newly formed buckets using scripted_metric
3. Filter the resulting buckets using Bucket Selector Aggregation

But this didn't work: the response contained an empty :buckets field.

So my questions still stand:

Is it possible to achieve my goal at all?
Maybe I should reconsider the index design? Merge requests and responses into a single entity, for example.

*My failed attempt:

GET logs/_search
{
  "size": 0,
  "aggs": {
    "request_id": {
      "terms": { "field": "request_id" },
      "aggs": {
        "path": {
          "scripted_metric": {
            "map_script": """
              if (doc['path'].size() != 0) {
                state.payload = doc['path'].value;
              }
            """,
            "combine_script": "return state.payload;",
            "reduce_script": "return states[0] == '/orders' ? 1 : 0;"
          }
        },
        "bucket_filter": {
          "bucket_selector": {
            "buckets_path": {
              "path": "path.value"
            },
            "script": "params.path == 1"
          }
        }
      }
    }
  }
}

P.S.: Sorry for the verbosity, I'm just trying to express myself as clearly as I can.
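
One possible way to stay with the current index, sketched under assumptions and not tested against this data: bucket_selector needs numeric inputs, and the document count of a filter sub-aggregation is numeric and reachable through the special _count path. The aggregation names (by_request, orders_request, failed_response, only_failed_orders) and the terms size of 1000 are made up for illustration, and request_id, message and path are assumed to be keyword fields.

```console
# Sketch: keep only request_id buckets that contain an '/orders' request
# and a response whose status is not 202. All aggregation names are hypothetical.
GET logs/_search
{
  "size": 0,
  "aggs": {
    "by_request": {
      "terms": { "field": "request_id", "size": 1000 },
      "aggs": {
        "orders_request": {
          "filter": { "term": { "path": "/orders" } }
        },
        "failed_response": {
          "filter": {
            "bool": {
              "filter": [ { "term": { "message": "response" } } ],
              "must_not": [ { "term": { "status": 202 } } ]
            }
          }
        },
        "only_failed_orders": {
          "bucket_selector": {
            "buckets_path": {
              "orders": "orders_request>_count",
              "failed": "failed_response>_count"
            },
            "script": "params.orders > 0 && params.failed > 0"
          }
        }
      }
    }
  }
}
```

Each bucket that survives is one failed '/orders' transaction, so the count is the number of buckets returned. With the six sample documents from the first post, only the 4512 bucket should remain. The terms size caps how many request_ids are examined, so this is unlikely to scale to a large log index, which is another reason to consider changing the index design.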

Glen_Smith · October 23, 2019, 4:51pm

Sorry, yeah, I should have noticed the requests and responses are in distinct docs.

(The detail and clarity - "verbosity" - is always much appreciated.)

Yes, it would be a very good idea to merge these into a single document. The scenario you're trying to accomplish is resounding testimony to the benefit. Elasticsearch's join-like capabilities are limited to nested documents and parent/child; with this data set, just merging the response into the request document is sufficient.
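
A minimal sketch of what merging could look like, assuming Elasticsearch 7.x, a hypothetical index named transactions, and that whatever ships the logs can use request_id as the document ID. Each log event is written as a partial-document upsert, so the request and the response end up in one document regardless of which arrives first.

```console
# Request event for transaction 4512 (the index name "transactions" is hypothetical)
POST transactions/_update/4512
{
  "doc": { "request_id": "4512", "path": "/orders" },
  "doc_as_upsert": true
}

# Response event for the same transaction, merged into the same document
POST transactions/_update/4512
{
  "doc": { "status": 500 },
  "doc_as_upsert": true
}

# Count failed '/orders' transactions; assumes path is a keyword field.
# The exists clause skips transactions whose response has not arrived yet.
GET transactions/_count
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "path": "/orders" } },
        { "exists": { "field": "status" } }
      ],
      "must_not": [
        { "term": { "status": 202 } }
      ]
    }
  }
}
```

With path and status in one document, no aggregation is needed for the count at all.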

If you stand up a Kibana instance, you can load sample data. One of the sample data sets is Web Logs, which you'll find is supported by a single index, in which the documents contain all the request info plus the response.

ryes · October 24, 2019, 12:57pm

Wow! This sample data is really great! Thank you for your help! Amazing support.

system · November 21, 2019, 12:57pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
