Rollup - Store statistics (just a number)

vdelcampo · May 12, 2020, 11:13am

Hi.

I need to store historical data from my indexes, specifically the number of requests per minute that my web services receive over time. Is it possible to store only this information? If I create a rollup job using only the fields I need to extract this information, the index becomes very large, depending on how many days the job processes. I only want numerical statistics. Is it possible to do that?

Thanks in advance

Víctor

Hendrik_Muhs · May 12, 2020, 12:58pm

Rollup stores extra meta information in order to provide rollup search. If you do not need rollup search, transform might be an option for you: it only stores what you ask for, and if you want to compress further, you can tweak the mappings to use smaller data types.

Whether rollup or transform, the reduction should mainly depend on the bucket size you choose, in your case the date histogram interval.
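
As a rough sketch of what that could look like for the requests-per-minute case: the first request pre-creates the destination index with a smaller numeric type, the second defines a transform that buckets by minute. The index names, the transform id and the field names (my_index, requests_per_minute, @timestamp, minute, request_count) are assumptions for illustration, and integer is only a safe choice if a per-minute count always fits into 32 bits.

```console
# Optional: create the destination index up front so the count is stored as an integer
PUT requests_per_minute
{
  "mappings": {
    "properties": {
      "minute": { "type": "date" },
      "request_count": { "type": "integer" }
    }
  }
}

# One output document per minute, holding only the number of requests
PUT _transform/requests_per_minute
{
  "source": { "index": "my_index" },
  "dest": { "index": "requests_per_minute" },
  "pivot": {
    "group_by": {
      "minute": {
        "date_histogram": { "field": "@timestamp", "fixed_interval": "1m" }
      }
    },
    "aggregations": {
      "request_count": { "value_count": { "field": "@timestamp" } }
    }
  }
}
```

A larger fixed_interval (for example 1h) produces fewer output documents and therefore a smaller index.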

vdelcampo · May 13, 2020, 9:40am

Hi Hendrik.

Transform is what I need! I created a transform to extract only the information I need, but when I go to Discover and use the new index I created, there's no option to filter by time. What can I do to filter by time?

Kind regards

Víctor

Hendrik_Muhs · May 13, 2020, 9:59am

Hi,

that's a current limitation, see this issue. The workaround is to either create the index pattern yourself instead of using the transform wizard, or delete the existing index pattern and create a new one. The limitation in the management UI has its own issue (which contains a third option: manually update the index pattern).

Hope that fixes it.

vdelcampo · May 13, 2020, 11:36am

Got it!

Thanks a lot Hendrik

vdelcampo · May 14, 2020, 7:25am

Hi Hendrik.

One more question. If I want to filter by time in Discover, is it required to group the index data by timestamp in the transform? Is there another way to do it?

Regards

Hendrik_Muhs · May 14, 2020, 8:28am

I think the group_by is where it makes the most sense. You can of course have time fields in aggregations too, e.g. a last_updated field. Still, if you do not group_by time, the result will not be a time series.
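
To illustrate the difference, here is a minimal preview request that keeps a time field as an aggregation instead of a group_by. The index name my_index and the fields status.keyword and @timestamp are assumptions; last_updated is just the example name from above.

```console
# One document per status value; last_updated is the newest @timestamp seen for that status
POST _transform/_preview
{
  "source": { "index": "my_index" },
  "pivot": {
    "group_by": {
      "status": { "terms": { "field": "status.keyword" } }
    },
    "aggregations": {
      "last_updated": { "max": { "field": "@timestamp" } }
    }
  }
}
```

The result has a date field you can pick as the time field, but each status appears only once, so there is nothing to plot over time.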

I am not sure I am getting your question, can you explain what you want to do?

vdelcampo · May 14, 2020, 9:02am

I'm sending all traffic from my F5 load balancer to Elastic. At the moment it's just 5 services, which are identified by the virtual_ip field. For each service there are many different requests, which I differentiate by the http_path field. All I want is a transform that stores in an index the total requests for a specific virtual_server, the total requests for a specific http_path of that virtual_server, the average response time and, if possible, the status of each request. Here's my transform code:

POST _transform/_preview
{
  "source": {
    "index": "my_index",
    "query": {
      "bool": {
        "must": [
          {
            "match": {
              "virtual_ip": "x.x.x.x"
            }
          }
        ],
        "filter": [
          {
            "term": {
              "http_path.keyword": "my_path/my_file.aspx"
            }
          },
          {
            "range": {
              "@timestamp": {
                "time_zone": "+02:00",
                "gte": "2020-05-07T00:00:00",
                "lte": "2020-05-08T00:00:00"
              }
            }
          }
        ]
      }
    }
  },
  "dest": {
    "index": "my_dest_index"
  },
  "pivot": {
    "group_by": {
      "status.keyword": {
        "terms": {
          "field": "status.keyword"
        }
      }
    },
    "aggregations": {
      "response_msecs.avg": {
        "avg": {
          "field": "response_msecs"
        }
      },
      "count": { "value_count": { "field": "@timestamp" } }
    }
  }
}

I don't want to use a time range because that means I would have to create an index for each day, for example.

Regards

Hendrik_Muhs · May 14, 2020, 9:35am

Thanks, always easier to work with examples.

Adding a date_histogram in group_by should work; however, you said you want "another way". What's the problem with something like this:

"group_by": {
  "day_bucket": {
    "date_histogram": {
      "field" : "@timestamp",
      "calendar_interval" : "1d"
    }
  },
  "status.keyword": {
    "terms": {
      "field": "status.keyword"
    }
  }
}
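
Put together with the query and aggregations from the post above, the whole transform could look roughly like this. The transform id service_status_daily is made up for the example, and the date range filter is left out on the assumption that a single destination index should cover all days.

```console
# Daily buckets per status for one virtual_ip and one http_path
PUT _transform/service_status_daily
{
  "source": {
    "index": "my_index",
    "query": {
      "bool": {
        "must": [
          { "match": { "virtual_ip": "x.x.x.x" } }
        ],
        "filter": [
          { "term": { "http_path.keyword": "my_path/my_file.aspx" } }
        ]
      }
    }
  },
  "dest": { "index": "my_dest_index" },
  "pivot": {
    "group_by": {
      "day_bucket": {
        "date_histogram": { "field": "@timestamp", "calendar_interval": "1d" }
      },
      "status.keyword": {
        "terms": { "field": "status.keyword" }
      }
    },
    "aggregations": {
      "response_msecs.avg": { "avg": { "field": "response_msecs" } },
      "count": { "value_count": { "field": "@timestamp" } }
    }
  }
}

# Creating the transform does not run it; it has to be started explicitly
POST _transform/service_status_daily/_start
```

With day_bucket in the destination index, that field can be selected as the time field when creating the index pattern, which is what makes the time filter available in Discover.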

vdelcampo · May 14, 2020, 12:48pm

That is what I was looking for! Thank you very much Hendrik!

Kind Regards

Víctor

system · June 11, 2020, 12:49pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
