Kibana Discover and DSL query - unknown field [aggs] and unknown field [size]

RobertBM · July 10, 2024, 9:46pm

Hello team, I am currently facing an issue while trying to load DSL query result in the Kibana Discover tool.
The query was created with the help from this forum , which provided the correct result do a NOT IN case. However when trying to load the query in Kibana we are facing the following errors

unknown field [aggs]
unknown field [size]

This is the full details on the error from Kibana front end -

  "error": {
    "root_cause": [
      {
        "type": "parsing_exception",
        "reason": "unknown query [aggs]",
        "line": 1,
        "col": 631
      }
    ],
    "type": "x_content_parse_exception",
    "reason": "[1:631] [bool] failed to parse field [filter]",
    "caused_by": {
      "type": "parsing_exception",
      "reason": "unknown query [aggs]",
      "line": 1,
      "col": 631,
      "caused_by": {
        "type": "named_object_not_found_exception",
        "reason": "[1:631] unknown field [aggs]"
      }
    }
  }
}

the query is the following:

{
  "size": 0,
  "aggs": {
    "processing_ids": {
      "terms": {
        "field": "processing_id",
        "size": 10000
      },
      "aggs": {
        "status_processing": {
          "terms": {
            "field": "status_processing"
          },
          "aggs": {
            "latest_record": {
              "top_hits": {
                "size": 1,
                "sort": [
                  {
                    "start_processing": {
                      "order": "desc"
                    }
                  }
                ]
              }
            }
          }
        },
        "processing_id_filter": {
          "bucket_selector": {
            "buckets_path": {
              "docCount": "_count"
            },
            "script": "params.docCount == 1"
          }
        }
      }
    }
  }
}

i could only find one reference that Kibana Visualization does not support aggregation, can you please confirm? otherwise is there any workaround?

NOTE: the query runs perfectly on dev tools all i want to be able to bring the results to my dashboard.

this is a data sample:

POST /processing_records/_bulk
{"index":{}}
{"processing_id":"1234","file_name":"file1.xls","start_processing":"06/07/2024","status_processing":"processing"}
{"index":{}}
{"processing_id":"1234","file_name":"file1.xls","start_processing":"06/07/2024","end_processing":"06/07/2024","status_processing":"processed"}
{"index":{}}
{"processing_id":"1235","file_name":"file2.xls","start_processing":"06/07/2024","status_processing":"processing"}
{"index":{}}
{"processing_id":"1235","file_name":"file2.xls","start_processing":"06/07/2024","end_processing":"06/07/2024","status_processing":"processed"}
{"index":{}}
{"processing_id":"1236","file_name":"file2.xls","start_processing":"06/07/2024","status_processing":"processing"}
{"index":{}}
{"processing_id":"1236","file_name":"file2.xls","start_processing":"06/07/2024","end_processing":"06/07/2024","status_processing":"processed"}
{"index":{}}
{"processing_id":"1237","file_name":"file4.xls","start_processing":"06/07/2024","status_processing":"processing"}

my visualization in Kibana should only bring the processing ID records where the status processing is PROCESSING but does not have a status processing recorded as PROCESSED. In summary, the idea is to present only the records that are in PROCESSING status

carly.richmond · July 12, 2024, 10:17am

Hi @RobertBM,

To confirm, are you adding the above query into the discover filter bar? Do you have KQL or Lucene selected (see below)?

As covered in this older StackOverflow thread, the Discover screen doesn't support aggregations.

Can you share what kind of visualization you're trying to create from this aggregation? You could try something like Aggregation-based visualizations or Vega to build the visualization.

Hope that helps!

RobertBM · July 17, 2024, 6:07am

thanks for your reply @carly.richmond
all im looking is for a way to add a view to a dashboard where the records will have the following sample:

processingid , filename , status_process , start_process , end_process
1234 , file1.csv , processing , 1jul24 , empty
1234 , file1.csv , processed , 1jul24 , 1jul24
1235 , file2.csv , processing , 1jul24 , empty
1235 , file2.csv , processed , 1jul24 , 1jul24
1236 , file3.csv , processing , 1jul24 , empty
1237 , file4.csv , processing , 1jul24 , empty

in the sample above, i want my view to show only process IDs 1236 and 1237 because they are the ones that have a status PROCESSING and NOT a status PROCESSED.

when we do this in a simple SQL i can use a NOT IN clause and we would be done, and the DSL query can do this in dev console, but i could not find a way to build this view in Kibana using Lens or any other features i could find.

hopes this clarifies the solution im looking for.

PS: thanks for the hint about VEGA, but still would like to understand other options because im working with scripted fields which will not work in situation where you want to search an index.

Scripted fields are calculated on-the-fly during data exploration. This means the computed value doesn't exist within the actual Elasticsearch document itself

thanks again

RobertBM · July 22, 2024, 3:14pm

hello team, hope you can give me some pointers on how to apply a solution to this.
thank you

carly.richmond · July 23, 2024, 10:17am

Hi @RobertBM,

Thanks for following up.

all im looking is for a way to add a view to a dashboard where the records will have the following sample

If you're looking to build a table like the above based on your provided query you might be able to create something using a Table-based plot in Vega-Lite by adapting some of these examples. This Vega visualization can then be added to a dashboard.

in the sample above, i want my view to show only process IDs 1236 and 1237 because they are the ones that have a status PROCESSING and NOT a status PROCESSED.

For this logic you could add a bool query to look for the statuses you need and want to exclude, similar to the following (warning I've not executed this):

GET test/_search
{
    "size": 0,
    "query": {
      "bool": {
        "must": [
          {
            "match": {
              "status_processing": "processing"
            }
          }
          ],
          "must_not": [
            {
              "match": {
                "status_processing": "processed"
              }
            }
            ]
      }
    }
    "aggs": {
      // Aggs omitted
    }
}

PS: thanks for the hint about VEGA, but still would like to understand other options because im working with scripted fields which will not work in situation where you want to search an index.

On your point about scripted fields, I did come across this example showing how scripted fields can be used with Vega/ Vega-Lite visualizations that might be worth checking out.

Hope that helps!

RobertBM · July 24, 2024, 6:18pm

Thanks for your reply @carly.richmond , I continue to explore the Vega option and thanks for the references, they are good, strangely they are more focused on plot data than a simple table print such as the ones we can get in the discovery, but at this point seems clear that the Discovery feature will not support the aggregation, not sure if there is a plan for a feature like this in the future, it would make it simpler to bring more live data into the dashboards if more advanced (aggregation based) DSL queries would be allowed/interpreted.

The query you provided was previously tested and will not work since it will bring records that have the PROCESSING status but its not bringing the expected result, here is a result example of your suggested query

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 5,
      "relation": "eq"
    },
    "max_score": 0.49247646,
    "hits": [
      {
        "_index": "processing_records",
        "_id": "pxOyjpABD6f2xmKTvC-N",
        "_score": 0.49247646,
        "_source": {
          "processing_id": "1234",
          "file_name": "file1.xls",
          "start_processing": "06/07/2024",
          "status_processing": "processing"
        }
      },
      {
        "_index": "processing_records",
        "_id": "qROyjpABD6f2xmKTvC-N",
        "_score": 0.49247646,
        "_source": {
          "processing_id": "1235",
          "file_name": "file2.xls",
          "start_processing": "06/07/2024",
          "status_processing": "processing"
        }
      },
      {
        "_index": "processing_records",
        "_id": "qxOyjpABD6f2xmKTvC-N",
        "_score": 0.49247646,
        "_source": {
          "processing_id": "1236",
          "file_name": "file2.xls",
          "start_processing": "06/07/2024",
          "status_processing": "processing"
        }
      },
      {
        "_index": "processing_records",
        "_id": "rROyjpABD6f2xmKTvC-N",
        "_score": 0.49247646,
        "_source": {
          "processing_id": "1237",
          "file_name": "file4.xls",
          "start_processing": "06/07/2024",
          "status_processing": "processing"
        }
      },
      {
        "_index": "processing_records",
        "_id": "5NcenpABzpByMMxwvVDV",
        "_score": 0.49247646,
        "_source": {
          "processing_id": "1238",
          "file_name": "file5.xls",
          "start_processing": "10/07/2024",
          "status_processing": "processing"
        }
      }
    ]
  }
}

the only correct query was provided in another thread i opened which is something like this:

POST /processing_records/_search
{
  "size": 0,
  "aggs": {
    "processing_ids": {
      "terms": {
        "field": "processing_id",
        "size": 10000
      },
      "aggs": {
        "status_processing": {
          "terms": {
            "field": "status_processing"
          },
          "aggs": {
            "latest_record": {
              "top_hits": {
                "size": 1,
                "sort": [
                  {
                    "start_processing": {
                      "order": "desc"
                    }
                  }
                ]
              }
            }
          }
        },
        "processing_id_filter": {
          "bucket_selector": {
            "buckets_path": {
              "docCount": "_count"
            },
            "script": "params.docCount == 1"
          }
        }
      }
    }
  }
}

this is the result as expected, unfortunately this generates the issue i mentioned at the beggining of this thread which is caused by the use of aggregation

"error": {
    "root_cause": [
      {
        "type": "parsing_exception",
        "reason": "unknown query [aggs]",
        "line": 1,
        "col": 631
      }
    ],
    "type": "x_content_parse_exception",
    "reason": "[1:631] [bool] failed to parse field [filter]",
    "caused_by": {
      "type": "parsing_exception",
      "reason": "unknown query [aggs]",
      "line": 1,
      "col": 631,
      "caused_by": {
        "type": "named_object_not_found_exception",
        "reason": "[1:631] unknown field [aggs]"
      }
    }
  }
}

thanks again, maybe this should be reviewed as a new feature for the Kibana discovery.

carly.richmond · July 25, 2024, 10:46am

Thanks for the update @RobertBM! If you would like to raise a feature request for Discover I would recommend raising a feature request issue on the Kibana repo in GitHub.

Do let us know how you get on!

Topic		Replies	Views
Query DSL in kibana discover Kibana	3	500	October 14, 2022
Query DSL Issue Kibana	3	524	September 3, 2018
Aggregation quires in Query DSL of add filters in Discover Kibana	4	1496	March 4, 2020
Unable to run Query in Kibana DSL Elasticsearch	1	630	February 18, 2021
Create Rules in kibana-> unknown field [aggs] Elasticsearch painless , kql-kibana-query-language	2	368	December 7, 2023

Kibana Discover and DSL query - unknown field [aggs] and unknown field [size]

Related topics