Scheduling the generation of a visualisation and saving/caching the result or automated CSV export

Hello,

We are currently using Elastic table visualisations to obtain a record of network connections from a certain group of machines; an example:

We are manually generating these on a machine-by-machine basis and exporting the result to CSV.

The issue we are having is that we require quite a long date range (30 days), and for certain machines the table can take up to an hour to generate. It may well be that I am using an inefficient method of aggregation; here is an example of the Request data on a 30-minute view:

{
  "aggs": {
    "0": {
      "terms": {
        "field": "winlog.event_data.DestPort",
        "order": {
          "_count": "desc"
        },
        "size": 50
      },
      "aggs": {
        "1": {
          "terms": {
            "field": "winlog.event_data.Protocol",
            "order": {
              "_count": "desc"
            },
            "size": 50
          },
          "aggs": {
            "2": {
              "terms": {
                "field": "winlog.event_data.SourceAddress",
                "order": {
                  "_count": "desc"
                },
                "size": 100
              },
              "aggs": {
                "3": {
                  "terms": {
                    "field": "winlog.event_data.DestAddress",
                    "order": {
                      "_count": "desc"
                    },
                    "size": 10
                  },
                  "aggs": {
                    "4": {
                      "terms": {
                        "field": "winlog.event_data.Application",
                        "order": {
                          "_count": "desc"
                        },
                        "size": 1000
                      },
                      "aggs": {
                        "5": {
                          "date_histogram": {
                            "field": "@timestamp",
                            "calendar_interval": "1h",
                            "time_zone": "Europe/London",
                            "extended_bounds": {
                              "min": 1663922333593,
                              "max": 1663924133593
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  },
  "size": 0,
  "fields": [
    {
      "field": "@timestamp",
      "format": "date_time"
    },
    {
      "field": "client.process.start",
      "format": "date_time"
    },
    {
      "field": "event.end",
      "format": "date_time"
    },
    {
      "field": "event.ingested",
      "format": "date_time"
    },
    {
      "field": "event.start",
      "format": "date_time"
    },
    {
      "field": "file.accessed",
      "format": "date_time"
    },
    {
      "field": "file.created",
      "format": "date_time"
    },
    {
      "field": "file.ctime",
      "format": "date_time"
    },
    {
      "field": "file.mtime",
      "format": "date_time"
    },
    {
      "field": "file.x509.not_after",
      "format": "date_time"
    },
    {
      "field": "file.x509.not_before",
      "format": "date_time"
    },
    {
      "field": "package.installed",
      "format": "date_time"
    },
    {
      "field": "process.parent.start",
      "format": "date_time"
    },
    {
      "field": "process.start",
      "format": "date_time"
    },
    {
      "field": "rsa.internal.lc_ctime",
      "format": "date_time"
    },
    {
      "field": "rsa.internal.time",
      "format": "date_time"
    },
    {
      "field": "rsa.time.effective_time",
      "format": "date_time"
    },
    {
      "field": "rsa.time.endtime",
      "format": "date_time"
    },
    {
      "field": "rsa.time.event_queue_time",
      "format": "date_time"
    },
    {
      "field": "rsa.time.event_time",
      "format": "date_time"
    },
    {
      "field": "rsa.time.expire_time",
      "format": "date_time"
    },
    {
      "field": "rsa.time.recorded_time",
      "format": "date_time"
    },
    {
      "field": "rsa.time.stamp",
      "format": "date_time"
    },
    {
      "field": "rsa.time.starttime",
      "format": "date_time"
    },
    {
      "field": "server.process.start",
      "format": "date_time"
    },
    {
      "field": "tls.detailed.client_certificate.not_after",
      "format": "date_time"
    },
    {
      "field": "tls.detailed.client_certificate.not_before",
      "format": "date_time"
    },
    {
      "field": "tls.detailed.server_certificate.not_after",
      "format": "date_time"
    },
    {
      "field": "tls.detailed.server_certificate.not_before",
      "format": "date_time"
    }
  ],
  "script_fields": {},
  "stored_fields": [
    "*"
  ],
  "runtime_mappings": {},
  "_source": {
    "excludes": []
  },
  "query": {
    "bool": {
      "must": [],
      "filter": [
        {
          "match_phrase": {
            "agent.name": "HOST-NAME"
          }
        },
        {
          "match_phrase": {
            "winlog.event_data.Direction": "%%14592"
          }
        },
        {
          "range": {
            "@timestamp": {
              "format": "strict_date_optional_time",
              "gte": "2022-09-23T08:38:53.593Z",
              "lte": "2022-09-23T09:08:53.593Z"
            }
          }
        }
      ],
      "should": [],
      "must_not": []
    }
  }
}

The idea was floated of triggering the generation for around 5 machines a night and having the CSVs emailed out, but on investigation there does not appear to be a mechanism to do this. Is this correct?

If this is not possible, is there a way to schedule the generation of these tables and save/cache the result, so that we can then go in and manually export the CSVs? I have had a play around with this, with my limited knowledge, but everything I've tried seems to just save the query, and the table attempts to generate from fresh.

Thanks.

Hello @Josh_G!

The idea was floated of triggering the generation for around 5 machines a night and having the CSVs emailed out, but on investigation there does not appear to be a mechanism to do this. Is this correct?

The only supported way today of automating the generation of reports and emailing them from Kibana is to use Watcher.
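As a minimal sketch of what such a watch could look like (assuming Watcher's `reporting` email attachment is available in your licence tier; the watch ID, schedule, recipient address, and the Kibana report POST URL below are placeholders you would substitute with your own):

```
PUT _watcher/watch/nightly_connection_report
{
  "trigger": {
    "schedule": { "daily": { "at": "02:00" } }
  },
  "actions": {
    "email_report": {
      "email": {
        "to": "you@example.com",
        "subject": "Nightly connection report",
        "attachments": {
          "connection_report.csv": {
            "reporting": {
              "url": "<Kibana CSV report generation POST URL>",
              "retries": 10,
              "interval": "60s"
            }
          }
        }
      }
    }
  }
}
```

The `reporting` attachment type polls Kibana until the report is ready (up to `retries` attempts, `interval` apart) and then attaches the resulting CSV to the email.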

Of course, you could also copy the POST URL for the CSV and run a cron job that triggers report generation, but it will not email the results.
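As a sketch of that approach: in Kibana you can copy the POST URL from Share > CSV Reports on the saved search, then have cron call it. The schedule, credentials, host, and `jobParams` below are placeholders, not values from your setup:

```
# crontab entry: queue the saved search's CSV report at 01:00 each night
0 1 * * * curl -s -X POST -u elastic:changeme -H "kbn-xsrf: true" \
  "https://kibana.example.com/api/reporting/generate/csv_searchsource?jobParams=..."
```

Note this only queues the report job; the generated CSV still has to be retrieved manually, e.g. from Stack Management > Reporting in Kibana.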

It is possible that winlog.event_data.Application is a high-cardinality field (I see you have size 1000), which might be negatively affecting performance. I'm not sure what the fix is here if you need your data bucketed in this way, but be aware that it may be causing a bottleneck.
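To confirm whether that field is the culprit, you could first measure its cardinality over your 30-day window with a cardinality aggregation (the index pattern below is a placeholder for whatever index you are querying):

```
GET winlogbeat-*/_search
{
  "size": 0,
  "query": {
    "range": {
      "@timestamp": { "gte": "now-30d" }
    }
  },
  "aggs": {
    "application_cardinality": {
      "cardinality": { "field": "winlog.event_data.Application" }
    }
  }
}
```

The returned count is approximate, but if it comes back far below 1000 you could lower the `size` on that terms aggregation accordingly.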

Another thing to consider for your search use case is sharding, which may require some experimentation to optimise: Size your shards | Elasticsearch Guide [8.4] | Elastic
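As a starting point for that investigation, the _cat/shards API shows how large each shard currently is (the index pattern is again a placeholder):

```
GET _cat/shards/winlogbeat-*?v=true&h=index,shard,prirep,store&s=store:desc
```

The guide linked above suggests aiming for shard sizes in roughly the 10 GB to 50 GB range for most search use cases; shards far outside that range are a common cause of slow aggregations.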