Hi,
I wanted to experiment with the new Rollup API, so I loaded a small chunk of data into a dev cluster and configured a rollup job.
I've set the cron expression to run every minute, but the job stats show it permanently stuck in the started state (not indexing), with pages processed staying at 0.
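In case it helps, the job looks roughly like this (index and field names below are placeholders, not my exact config). Note the cron field takes an Elasticsearch/Quartz-style expression where the first position is seconds, so "every minute" is `0 * * * * ?`:

```
PUT _rollup/job/webserverlogs_rollup_job
{
  "index_pattern": "webserverlogs",
  "rollup_index": "webserverlogs_rollup",
  "cron": "0 * * * * ?",
  "page_size": 1000,
  "groups": {
    "date_histogram": {
      "field": "@timestamp",
      "interval": "60m"
    }
  },
  "metrics": [
    { "field": "bytes", "metrics": ["sum", "avg"] }
  ]
}
```

After starting it, the stats come back with `job_state: started` but every counter at zero:

```
POST _rollup/job/webserverlogs_rollup_job/_start
GET _rollup/job/webserverlogs_rollup_job
```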
Ah, I see some exceptions mentioning "data too large" in the Elasticsearch log. I guess I was expecting that kind of info to be surfaced in the job stats response somehow. Will dig into it.
I'm getting a circuit breaker exception, which seems to be the root cause of why my rollups aren't being generated. The funny thing is, no matter what I set the heap size to, the breaker still trips, with the "data too large" amounts scaling accordingly:
Originally the max heap was set to the default of 1GB, which tripped the circuit breaker at around 750MB. After upping it to 4GB, I'm getting this:
```
Caused by: org.elasticsearch.common.breaker.CircuitBreakingException: [request] Data too large, data for [<reused_arrays>] would be [3200119680/2.9gb], which is larger than the limit of [2495663308/2.3gb]
```
The "raw" webserverlogs index currently holds 117MB of data.
EDIT: Looks like the root cause was that the rollup index was being matched by the index template for the raw index. As far as I can see, rollup jobs expect to create their own mappings on the rollup index. Maybe something to clarify in the documentation.
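For anyone hitting the same thing, here's a minimal sketch of the clash (template and index names are hypothetical, not my exact setup). A template pattern like `webserverlogs*` also matches the rollup index `webserverlogs_rollup`, so the raw-log mappings get applied to it and collide with the mappings the rollup job wants to create:

```
# Too broad: this pattern also catches the rollup index
PUT _template/webserverlogs
{
  "index_patterns": ["webserverlogs*"],
  "mappings": {
    "properties": {
      "message": { "type": "text" }
    }
  }
}
```

Narrowing the pattern (e.g. to `webserverlogs-*`) or naming the rollup index so it falls outside the pattern lets the rollup job manage its own mappings.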