Rollup - no data


(Trondhindenes) #1

Hi,
I wanted to experiment with the new Rollup API, so I loaded a small chunk of data into a dev cluster and configured a rollup job. The job stats object shows this:

"stats": {
  "pages_processed": 0,
  "documents_processed": 0,
  "rollups_indexed": 0,
  "trigger_count": 12
}

I've configured the cron expression to run the job every minute, but it just sits in the started state (never indexing) and pages_processed stays at 0.
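To make the symptom concrete, here is a minimal Python sketch of the check I'm effectively doing by eye: the job keeps getting triggered, but none of the progress counters move. The function name `looks_stalled` is made up for illustration; only the field names come from the stats object above.

```python
def looks_stalled(stats: dict) -> bool:
    """Return True when the job has been triggered but has made no progress."""
    triggered = stats.get("trigger_count", 0) > 0
    no_progress = (
        stats.get("pages_processed", 0) == 0
        and stats.get("documents_processed", 0) == 0
        and stats.get("rollups_indexed", 0) == 0
    )
    return triggered and no_progress


# The stats object from the job response above.
stats = {
    "pages_processed": 0,
    "documents_processed": 0,
    "rollups_indexed": 0,
    "trigger_count": 12,
}
print(looks_stalled(stats))
```

A job in this state is worth cross-checking against the Elasticsearch server log, since the stats alone do not say why nothing was processed.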

Any pointers appreciated!


(Christian Dahlqvist) #2

How did you configure the rollup job? What does your data look like?


(Trondhindenes) #3

Ah, I see some exceptions regarding "data too large" in the Elasticsearch job. I guess I was expecting that kind of info to be surfaced into the job response somehow. Will dig into it.


(Trondhindenes) #4

I'm getting a circuit breaker exception, which seems to be the root cause of why my rollups aren't being generated. Funny thing is, whatever I set the heap size to, the breaker still trips on a request slightly larger than the limit, with a "data too large" exception:

Originally the max heap was set to the default of 1GB, and the circuit breaker tripped at around 750MB. After raising the heap to 4GB, I'm getting this:

Caused by: org.elasticsearch.common.breaker.CircuitBreakingException: [request] Data too large, data for [<reused_arrays>] would be [3200119680/2.9gb], which is larger than the limit of [2495663308/2.3gb]
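
For reading the numbers in that message, a small Python sketch that pulls out the requested size and the limit and works out how far over the request was. The helper name `parse_breaker_message` is made up for illustration. The 60% figure is an assumption: it is the default for `indices.breaker.request.limit` as far as I know, and I have not changed it on this cluster.

```python
import re

MESSAGE = (
    "[request] Data too large, data for [<reused_arrays>] would be "
    "[3200119680/2.9gb], which is larger than the limit of [2495663308/2.3gb]"
)


def parse_breaker_message(message: str) -> dict:
    """Extract the breaker name, requested bytes and limit bytes from the message."""
    match = re.search(
        r"^\[(\w+)\].*would be \[(\d+)/[^\]]+\].*limit of \[(\d+)/[^\]]+\]",
        message,
    )
    if match is None:
        raise ValueError("not a circuit breaker message")
    wanted, limit = int(match.group(2)), int(match.group(3))
    return {
        "breaker": match.group(1),
        "wanted": wanted,
        "limit": limit,
        "over_by": wanted - limit,
    }


info = parse_breaker_message(MESSAGE)

# Assumption: the request breaker limit is the default 60% of the JVM heap.
ASSUMED_REQUEST_BREAKER_FRACTION = 0.6
implied_heap_gib = info["limit"] / ASSUMED_REQUEST_BREAKER_FRACTION / 1024**3

print(info["breaker"], info["over_by"], round(implied_heap_gib, 2))
```

Under that assumption the limit corresponds to a heap of a bit under 4GB, so the limit scales with the heap as expected; it is the size of the request that keeps growing past it.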

Here's my rollup job:

{
    "index_pattern": "webserverlogs-prod-*",
    "rollup_index": "webserverlogs-prod_rollup",
    "cron": "* * * * * ?",
    "page_size" :10,
    "groups" : {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "10m",
        "delay": "5m"
      },
      "terms": {
        "fields": ["request_host", "request_method", "upstream_statuscode"]
      },
      "histogram": {
        "fields": ["timetaken"],
        "interval": 100
      }
    },
    "metrics": [
        {
            "field": "timetaken",
            "metrics": ["min", "max", "avg", "value_count"]
        }
    ]
}
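
One thing worth double-checking in the job above is the cron expression. As far as I understand the Elasticsearch cron syntax, it has six mandatory fields with seconds first (plus an optional year), so `* * * * * ?` would fire every second rather than every minute, and something like `0 * * * * ?` would be once a minute. A small Python sketch that just labels the fields (the helper `label_cron_fields` is made up for illustration and does not validate the values):

```python
# Field order as I understand it for Elasticsearch cron expressions.
FIELD_NAMES = [
    "seconds", "minutes", "hours", "day_of_month", "month", "day_of_week", "year",
]


def label_cron_fields(expression: str) -> dict:
    """Map each whitespace-separated cron field to its name (year is optional)."""
    parts = expression.split()
    if len(parts) not in (6, 7):
        raise ValueError("expected 6 or 7 fields, got %d" % len(parts))
    return dict(zip(FIELD_NAMES, parts))


as_posted = label_cron_fields("* * * * * ?")
once_a_minute = label_cron_fields("0 * * * * ?")

print(as_posted["seconds"])      # "*" means every second
print(once_a_minute["seconds"])  # "0" means only at second 0 of each minute
```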

The "raw" webserverlogs index currently holds 117MB of data.

EDIT: Looks like the root cause was that the index template for the raw indices also matched the rollup index, so its mappings were applied there as well. As far as I can see, rollup expects to be able to create its own mappings for the rollup index. Maybe something to clarify in the documentation.
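
For anyone hitting the same thing, a quick way to see the collision is to test the rollup index name against the template's index pattern. This is a rough Python sketch: the pattern `webserverlogs-*` is a hypothetical stand-in (substitute whatever pattern your own template uses), and `fnmatch` is only an approximation of Elasticsearch's `*` wildcard matching.

```python
from fnmatch import fnmatchcase

# Names taken from the rollup job above.
ROLLUP_INDEX = "webserverlogs-prod_rollup"
JOB_INDEX_PATTERN = "webserverlogs-prod-*"

# Hypothetical template pattern, for illustration only.
TEMPLATE_PATTERN = "webserverlogs-*"


def template_applies(index_name: str, template_pattern: str) -> bool:
    """Approximate check: would a template with this pattern match the index?"""
    return fnmatchcase(index_name, template_pattern)


# The job's own index pattern does not pick up the rollup index...
print(template_applies(ROLLUP_INDEX, JOB_INDEX_PATTERN))
# ...but a broader template pattern like the hypothetical one does.
print(template_applies(ROLLUP_INDEX, TEMPLATE_PATTERN))
```

If the second check is true for your template, either narrow the template pattern or pick a rollup index name that falls outside it.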

