Aggregations query timeout and cancellation

Hi guys,

Setting the timeout property on a search query seems to have no effect. I know that query timeouts are not very precise, because the timeout is only checked at certain points during execution, but the query below finishes successfully after ~1 min of execution despite having the timeout set to 5 seconds.

What's also interesting is that cancellation (through the Task API) has no effect either. I have a test script that runs the query below, waits 5 seconds, and then cancels the query, but the result is the same as with the timeout property: the query finishes successfully in ~1 min. Is this an edge case for aggregation queries? I would appreciate any information about this behaviour.

{
  "timeout": "5s",
  "size": 0,
  "aggs": {
    "context": {
      "aggs": {
        "metric": {
          "aggs": {
            "metric": {
              "cardinality": {
                "field": "some_id",
                "precision_threshold": 20000
              }
            }
          },
          "date_histogram": {
            "extended_bounds": {
              "max": "2018-12-31T23:59:59-02:00",
              "min": "2018-08-01T00:00:00-02:00"
            },
            "field": "entity_created_at",
            "interval": "day",
            "min_doc_count": 0,
            "time_zone": "America/Sao_Paulo"
          }
        },
        "unique_count": {
          "cardinality": {
            "field": "some_id",
            "precision_threshold": 20000
          },
          "meta": {
            "unique_count": "some_id"
          }
        }
      },
      "terms": {
        "field": "some_tag_ids",
        "size": 10000
      }
    },
    "value": {
      "cardinality": {
        "field": "some_id",
        "precision_threshold": 20000
      }
    }
  },
  "from": 0,
  "query": {
    "bool": {
      "filter": [
        {
          "bool": {
            "minimum_should_match": 1,
            "should": [
              {
                "term": {
                  "entity_id": 13
                }
              },
              {
                "term": {
                  "entity_id": 14
                }
              }
            ]
          }
        },
        {
          "range": {
            "subentity_created_at": {
              "lte": "2018-12-31T23:59:59-02:00",
              "gte": "2018-08-01T00:00:00-02:00",
              "time_zone": "America/Sao_Paulo"
            }
          }
        },
        {
          "terms": {
            "some_tag_ids": [
              101,
              102,
              103,
              104,
              105,
              106,
              107,
              108,
              109,
              110
            ]
          }
        }
      ]
    }
  },
  "sort": {}
}
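
For anyone who wants to reproduce the cancellation test described above, here is a minimal sketch of the cancelling half, not the original script. It assumes an unsecured node at `http://localhost:9200`; `ES_URL` and the helper names `request`, `search_task_ids`, `cancel_running_searches` and `was_cut_short` are made up for illustration. Note that it cancels every running search task, not only the query above, so it is only suitable for a test cluster.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local test node without auth


def request(method, path, body=None):
    """Send a JSON request to the node and return the decoded JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        ES_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def search_task_ids(tasks_response):
    """Collect task ids ("node_id:number") from a GET /_tasks response."""
    ids = []
    for node in tasks_response.get("nodes", {}).values():
        ids.extend(node.get("tasks", {}).keys())
    return ids


def cancel_running_searches():
    """List running search tasks and ask the Task API to cancel each one."""
    listing = request("GET", "/_tasks?actions=*search&detailed")
    for task_id in search_task_ids(listing):
        request("POST", "/_tasks/" + task_id + "/_cancel")


def was_cut_short(search_response):
    """True if the search response reports that the timeout was hit."""
    return bool(search_response.get("timed_out"))
```

In a test, the search would be started in one thread or process, and `cancel_running_searches()` called from another after a 5 second sleep. Checking `was_cut_short` on the search response shows whether the `timeout` setting actually produced partial results.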

I may be wrong on this, but I recall that some parts of a query are tight loops that are deliberately not interrupted because they build data structures that are useful to all queries, not just the current one. Global ordinals, for example, need to be rebuilt after a refresh, and if a query starts that process I think it may not be timed out because the resulting cache will benefit other queries. That certainly used to be the policy when loading the field data cache.

If global ordinals are the issue (see example) then it might be worth avoiding them by setting "execution_hint": "map" on the terms aggregation, provided your query matches a modest number of terms.
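
As a rough illustration of that suggestion, the sketch below adds `"execution_hint": "map"` to every `terms` aggregation in a search body before it is sent. The helper name `with_map_execution_hint` is made up for this example, and the `query` dict is a trimmed-down version of the one posted above. Whether this actually makes the query respond to timeouts is untested here.

```python
import copy


def with_map_execution_hint(body):
    """Return a copy of a search body with "execution_hint": "map"
    set on every terms aggregation, including nested ones."""
    body = copy.deepcopy(body)

    def visit(aggs):
        for agg in aggs.values():
            if "terms" in agg:
                agg["terms"]["execution_hint"] = "map"
            # Sub-aggregations may be declared under either key.
            for key in ("aggs", "aggregations"):
                if key in agg:
                    visit(agg[key])

    for key in ("aggs", "aggregations"):
        if key in body:
            visit(body[key])
    return body


# Trimmed-down version of the query from the question.
query = {
    "timeout": "5s",
    "size": 0,
    "aggs": {
        "context": {
            "terms": {"field": "some_tag_ids", "size": 10000},
            "aggs": {
                "unique_count": {
                    "cardinality": {"field": "some_id", "precision_threshold": 20000}
                }
            },
        }
    },
}

patched = with_map_execution_hint(query)
print(patched["aggs"]["context"]["terms"])
```

The original `query` dict is left unchanged, so the two variants can be timed side by side.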
