Scripts for generating the index template and asciidoc

There are scripts in topbeat that generate the index template and the asciidoc. However, they do not support the nested data type yet. Is using the nested data type good practice in ES? If so, should those scripts support it?

I patched the scripts in nginxbeat (1, 2). Should topbeat apply the same set of changes even though it does not need the nested data type? I think that would help other Beat developers get started more easily.

From what I know, Kibana doesn't support "nested" queries. There is an open PR for that: https://github.com/elastic/kibana/pull/4806. Since we want to use Kibana, the nested data type was not an option for the Beats.

I see that the generated template for nginxbeat uses the nested data type. Did you find a way to visualize the data with Kibana, or do you use something else for visualization?

In the future, I am thinking of having a common script, with extended functionality, that generates the template for each Beat, as I find it annoying to copy and adjust the script whenever a new Beat is created. I think it's a good idea for this script to support nested as well :smile:
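To make the idea of a shared generator a bit more concrete, here is a minimal Python sketch of how such a script might emit a `nested` mapping. The field description format (`name`, `type`, `fields`) and the function name `build_properties` are assumptions made for illustration; they are not the format or API of the existing topbeat or nginxbeat scripts. Type names are passed through unchanged, so use whatever your Elasticsearch version expects (older versions use `string` rather than `keyword`).

```python
def build_properties(fields):
    """Turn a list of field descriptions into an Elasticsearch "properties" dict."""
    properties = {}
    for field in fields:
        if "fields" in field:
            # A group of sub-fields: a plain object by default,
            # or a nested object when the description asks for it.
            mapping = {"properties": build_properties(field["fields"])}
            if field.get("type") == "nested":
                mapping["type"] = "nested"
        else:
            # A leaf field: pass the declared type through as-is.
            mapping = {"type": field["type"]}
        properties[field["name"]] = mapping
    return properties


# Hypothetical description of the nginx server zones (not a real fields file).
FIELDS = [
    {
        "name": "server_zones",
        "type": "nested",
        "fields": [
            {"name": "name", "type": "keyword"},
            {"name": "requests", "type": "long"},
            {"name": "responses", "fields": [
                {"name": "total", "type": "long"},
            ]},
        ],
    },
]

properties = build_properties(FIELDS)
```

A Beat that does not need nested data would simply never set `"type": "nested"` on a group, so the same script could serve both cases.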

I think it is aggregations in Kibana that don't work with nested data, but filtering should work. Right?

I haven't tried visualizing the nested data type yet :stuck_out_tongue: Sorry, I cannot give any input here. However, the nginx status report needs it; otherwise I have to use a dynamic data type (which is discouraged, I believe).

@steffens Correct. Nested aggregations don't work. Filtering / querying is not a problem.

So I think I had better emit multiple events of one type (e.g. upstream) rather than use the nested data type to hold them?

@mrkschan can you give an example?

See below for the original dynamic data type and the transformed nested data type in an array.

# Original dynamic data type
        "server_zones": {
          "hg.nginx.org": {
            "discarded": 1732,
            "processing": 0,
            "received": 9404252,
            "requests": 33082,
            "responses": {
                "1xx": 0,
                "2xx": 29893,
                "3xx": 857,
                "4xx": 453,
                "5xx": 147,
                "total": 31350
            },
            "sent": 943707757
          },
          "trac.nginx.org": {
            "discarded": 396,
            "processing": 1,
            "received": 16824065,
            "requests": 47339,
            "responses": {
                "1xx": 0,
                "2xx": 22694,
                "3xx": 21060,
                "4xx": 3061,
                "5xx": 127,
                "total": 46942
            },
            "sent": 771492649
          },
          "lxr.nginx.org": {
            "discarded": 112,
            "processing": 0,
            "received": 888931,
            "requests": 3684,
            "responses": {
                "1xx": 0,
                "2xx": 3361,
                "3xx": 125,
                "4xx": 80,
                "5xx": 6,
                "total": 3572
            },
            "sent": 82231555
          }
        },

# Transformed nested data type
        "server_zones": [{
            "name": "hg.nginx.org",
            "discarded": 1732,
            "processing": 0,
            "received": 9404252,
            "requests": 33082,
            "responses": {
                "1xx": 0,
                "2xx": 29893,
                "3xx": 857,
                "4xx": 453,
                "5xx": 147,
                "total": 31350
            },
            "sent": 943707757
        }, {
            "name": "trac.nginx.org",
            "discarded": 396,
            "processing": 1,
            "received": 16824065,
            "requests": 47339,
            "responses": {
                "1xx": 0,
                "2xx": 22694,
                "3xx": 21060,
                "4xx": 3061,
                "5xx": 127,
                "total": 46942
            },
            "sent": 771492649
        }, {
            "name": "lxr.nginx.org",
            "discarded": 112,
            "processing": 0,
            "received": 888931,
            "requests": 3684,
            "responses": {
                "1xx": 0,
                "2xx": 3361,
                "3xx": 125,
                "4xx": 80,
                "5xx": 6,
                "total": 3572
            },
            "sent": 82231555
        }],
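
The transformation between the two shapes above can be sketched in Python roughly as follows. The function name `zones_to_list` is hypothetical, and the sample data is an abbreviated copy of the status report shown above.

```python
def zones_to_list(server_zones):
    """Convert {"zone name": {...stats...}} into [{"name": "zone name", ...stats...}]."""
    # The zone name moves from the dictionary key into a regular "name" field,
    # so the set of field names no longer depends on which zones exist.
    return [{"name": name, **stats} for name, stats in server_zones.items()]


# Abbreviated sample taken from the dynamic-key form above.
SERVER_ZONES = {
    "hg.nginx.org": {"discarded": 1732, "requests": 33082},
    "trac.nginx.org": {"discarded": 396, "requests": 47339},
}

zones = zones_to_list(SERVER_ZONES)
```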

@mrkschan In your case I would probably go for option 2, as this would allow for more flexibility because the events are sent independently. For example, you could send data for a busy host more frequently.

Right, I think it's better to avoid arrays and create multiple smaller documents. We did the same for Topbeat.
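
As a rough illustration of the "multiple smaller documents" approach, the Python sketch below emits one event per server zone instead of one document holding an array. The event layout (`@timestamp`, `type`, `zone`) and the function name `zones_to_events` are assumptions for the example, not the actual nginxbeat or Topbeat event format.

```python
def zones_to_events(server_zones, timestamp):
    """Build one flat event per zone instead of a single document with an array."""
    events = []
    for name, stats in server_zones.items():
        events.append({
            "@timestamp": timestamp,   # assumed field name for the event time
            "type": "zone",            # hypothetical event type
            "zone": {"name": name, **stats},
        })
    return events


# Abbreviated sample based on the nginx status report in this thread.
SERVER_ZONES = {
    "hg.nginx.org": {"processing": 0, "requests": 33082},
    "trac.nginx.org": {"processing": 1, "requests": 47339},
}

events = zones_to_events(SERVER_ZONES, "2015-10-01T12:00:00Z")
```

Because each event is a plain object with fixed field names, no nested mapping is needed, and regular Kibana aggregations (for example, a sum of `zone.requests` split by `zone.name`) apply.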