Scripts for generating the index template and AsciiDoc


(Ks Chan) #1

There are scripts in topbeat that generate the index template and the AsciiDoc documentation, but they do not support the nested data type yet. Is using the nested data type good practice in ES? If so, should those scripts support it?

I patched the scripts in nginxbeat 1, 2. Should topbeat apply the same set of changes even though it does not need the nested data type? I think that would help other Beat developers get started more easily.


(Monica Sarbu) #2

From what I know, Kibana doesn't support "nested" queries. There is an open PR for that: https://github.com/elastic/kibana/pull/4806. Since we want to use Kibana, the nested data type was not an option for the Beats.

I see that the generated template for nginxbeat uses the nested data type. I am curious: did you manage to find a way to visualize the data with Kibana, or do you use something else for visualization?

In the future, I am thinking of having a common script, with extended functionality, that generates the template for each Beat, as I find it annoying to copy and adjust the script whenever a new Beat is created. I think it's a good idea for this script to support nested as well :smile:


(Steffen Siering) #3

I think it is aggregations in Kibana that don't work with nested data, but filtering should work. Right?


(Ks Chan) #4

I haven't tried visualizing the nested data type yet :stuck_out_tongue: Sorry, I cannot give any input here. The nginx status report needs it, though; otherwise I would have to use a dynamic data type (which is discouraged, I believe).


(ruflin) #5

@steffens Correct. Nested aggregations don't work. Filtering / querying is not a problem.
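
To make the distinction concrete, here is a rough sketch of what a nested mapping and a nested filter look like as Elasticsearch request bodies, built as plain Python dicts. The helper names `nested_mapping` and `nested_filter` are made up for illustration and are not part of any Beat script. The `keyword` field type is an assumption that fits current Elasticsearch versions; older versions used a non-analyzed string instead.

```python
import json


def nested_mapping(path, properties):
    """Build a mapping fragment that declares `path` as a nested field.

    Hypothetical helper: only the {"type": "nested"} part comes from
    Elasticsearch itself, the function is an illustration.
    """
    return {"properties": {path: {"type": "nested", "properties": properties}}}


def nested_filter(path, field, value):
    """Build a nested query that matches documents where one nested
    object under `path` has `field` equal to `value`.

    Fields inside a nested query are addressed by their full path.
    """
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {"term": {path + "." + field: value}},
            }
        }
    }


if __name__ == "__main__":
    # Field types here are assumptions for the nginx example.
    mapping = nested_mapping(
        "server_zones",
        {"name": {"type": "keyword"}, "requests": {"type": "long"}},
    )
    print(json.dumps(mapping, indent=2))
    print(json.dumps(nested_filter("server_zones", "name", "hg.nginx.org"), indent=2))
```

Filtering like this works against Elasticsearch directly; the limitation discussed here is on the Kibana aggregation side.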


(Ks Chan) #6

So I think I had better emit multiple events of one type (e.g. upstream) rather than use the nested data type to hold them?


(ruflin) #7

@mrkschan Can you give an example?


(Ks Chan) #8

Below are the original dynamic data type and the same data transformed into a nested data type held in an array.

# Original dynamic data type
        "server_zones": {
          "hg.nginx.org": {
            "discarded": 1732,
            "processing": 0,
            "received": 9404252,
            "requests": 33082,
            "responses": {
                "1xx": 0,
                "2xx": 29893,
                "3xx": 857,
                "4xx": 453,
                "5xx": 147,
                "total": 31350
            },
            "sent": 943707757
          },
          "trac.nginx.org": {
            "discarded": 396,
            "processing": 1,
            "received": 16824065,
            "requests": 47339,
            "responses": {
                "1xx": 0,
                "2xx": 22694,
                "3xx": 21060,
                "4xx": 3061,
                "5xx": 127,
                "total": 46942
            },
            "sent": 771492649
          },
          "lxr.nginx.org": {
            "discarded": 112,
            "processing": 0,
            "received": 888931,
            "requests": 3684,
            "responses": {
                "1xx": 0,
                "2xx": 3361,
                "3xx": 125,
                "4xx": 80,
                "5xx": 6,
                "total": 3572
            },
            "sent": 82231555
          }
        },

# Transformed nested data type
        "server_zones": [{
            "name": "hg.nginx.org",
            "discarded": 1732,
            "processing": 0,
            "received": 9404252,
            "requests": 33082,
            "responses": {
                "1xx": 0,
                "2xx": 29893,
                "3xx": 857,
                "4xx": 453,
                "5xx": 147,
                "total": 31350
            },
            "sent": 943707757
        }, {
            "name": "trac.nginx.org",
            "discarded": 396,
            "processing": 1,
            "received": 16824065,
            "requests": 47339,
            "responses": {
                "1xx": 0,
                "2xx": 22694,
                "3xx": 21060,
                "4xx": 3061,
                "5xx": 127,
                "total": 46942
            },
            "sent": 771492649
        }, {
            "name": "lxr.nginx.org",
            "discarded": 112,
            "processing": 0,
            "received": 888931,
            "requests": 3684,
            "responses": {
                "1xx": 0,
                "2xx": 3361,
                "3xx": 125,
                "4xx": 80,
                "5xx": 6,
                "total": 3572
            },
            "sent": 82231555
        }],
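
The transformation between the two shapes above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual nginxbeat code; the function name `zones_to_list` is hypothetical, and the sample input is trimmed to two fields per zone.

```python
def zones_to_list(server_zones):
    """Turn {"<zone name>": {...stats...}} into [{"name": "<zone name>", ...stats...}].

    Moving the zone name from a key into a "name" field gives every
    element the same set of keys, so the mapping no longer depends on
    which zones exist.
    """
    return [{"name": name, **stats} for name, stats in server_zones.items()]


if __name__ == "__main__":
    dynamic = {
        "hg.nginx.org": {"requests": 33082, "responses": {"5xx": 147}},
        "lxr.nginx.org": {"requests": 3684, "responses": {"5xx": 6}},
    }
    for zone in zones_to_list(dynamic):
        print(zone)
```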

(ruflin) #9

@mrkschan In your case I would probably go for option 2, as this allows more flexibility because the events are sent independently. For example, you could send data for a busy host more frequently.


(Tudor Golubenco) #10

Right, I think it's better to avoid arrays and create multiple smaller documents. We did the same for Topbeat.
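
As a rough sketch of that approach for the nginx example, assuming the status report has already been parsed into a dict: each zone becomes its own flat document. The function name `zone_events`, the `"zone"` event type, and the exact field layout are assumptions for illustration, not what Topbeat or nginxbeat actually emit.

```python
def zone_events(status, timestamp, event_type="zone"):
    """Yield one small event per server zone instead of one event
    carrying every zone in an array.

    `status` is the parsed status report; `timestamp` is passed in by
    the caller so all events from one poll share the same value.
    """
    for name, stats in status.get("server_zones", {}).items():
        yield {
            "@timestamp": timestamp,
            "type": event_type,  # hypothetical event type name
            "zone": {"name": name, **stats},
        }


if __name__ == "__main__":
    status = {
        "server_zones": {
            "hg.nginx.org": {"requests": 33082, "sent": 943707757},
            "trac.nginx.org": {"requests": 47339, "sent": 771492649},
        }
    }
    for event in zone_events(status, "2016-01-01T00:00:00.000Z"):
        print(event)
```

Because every document has the same flat shape, no nested or dynamic mapping is needed, and Kibana can aggregate on fields such as `zone.requests` grouped by `zone.name`.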

