Rest High Level Client : Request timeout is not working

We are trying to use a request timeout in our queries but it doesn't seem to be working for us.

Here is what we have done as part of the setup:

  • search.default_allow_partial_results : false (on server side as well as client side)

  • Set a timeout of 10ms on every search query that we send. (client side)

  • Apart from these, we have global timeouts set as shown in the code below:

    RestHighLevelClient client = new RestHighLevelClient(
        RestClient.builder(httpHost)
            .setRequestConfigCallback(requestConfigBuilder -> requestConfigBuilder
                .setConnectTimeout(30000)
                .setConnectionRequestTimeout(90000)
                .setSocketTimeout(90000))
            .setMaxRetryTimeoutMillis(90000));

Queries take more than 8 seconds but still do not time out. We disabled partial results expecting to get a timeout error instead, but we do not get any error either.

Also, the isTimedOut flag is always returned as false, even though the query took longer than the specified timeout.

Here's a sample of the request that I'm sending:

    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();

    BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
    QueryBuilder infraQueryBuilder = QueryBuilders.termQuery("field1", field1);
    QueryBuilder totalCountRangeQueryBuilder = QueryBuilders.rangeQuery("field2").gte(3);

    BoolQueryBuilder innerBoolQueryBuilder = QueryBuilders.boolQuery();
    innerBoolQueryBuilder.must(QueryBuilders.rangeQuery("nestedDocType1.field1").gt(2));

    QueryBuilder filter = QueryBuilders
        .nestedQuery("nestedDocType1", innerBoolQueryBuilder, ScoreMode.Max)
        .innerHit(new InnerHitBuilder()
            .setFetchSourceContext(
                new FetchSourceContext(true, new String[]{"nestedDocType1.field1"}, null))
            .addSort(SortBuilders.fieldSort("nestedDocType1.field1").order(SortOrder.DESC))
            .setSize(1)
        );

    boolQueryBuilder.must(infraQueryBuilder);
    boolQueryBuilder.must(totalCountRangeQueryBuilder);

    if (inputRevisions != null && (inputRevisions.size() > 0)) {
      QueryBuilder allEligibleRevisionsFilter = QueryBuilders
          .termsQuery("field3", inputRevisions);
      boolQueryBuilder.must(allEligibleRevisionsFilter);
    }

    boolQueryBuilder.filter(filter);

    sourceBuilder.query(boolQueryBuilder)
        .fetchSource(new String[]{
            "field3",
            "field2"
        }, null);

    sourceBuilder.size(batchSize);
    sourceBuilder.timeout(TimeValue.timeValueMillis(10));

    SearchRequest searchRequest = createSearchRequest(sourceBuilder, enterpriseId);
    searchRequest.allowPartialSearchResults(false);

    SearchResponse searchResponse = getSearchResponse(searchRequest);

    ESCustomScroll<Set<String>> esCustomScroll = this::populateProcessedRevisionsSetWithESScroll;
    getESDataByScroll(esCustomScroll, searchResponse, processedRevisions);  // gets the data by scrolling over again and again until data is available.

Here's the code that we use for scrolling:

private boolean populateProcessedRevisionsSetWithESScroll(SearchResponse searchResponse, Set<String> processedRevisions) {
            if(searchResponse == null ||
                searchResponse.getHits() == null ||
                searchResponse.getHits().getHits() == null ||
                searchResponse.getHits().getHits().length == 0) {
              return false;
            }

            for(SearchHit outerHit : searchResponse.getHits().getHits()) {
              Map<String, Object> outerSourceMap = outerHit.getSourceAsMap();

              String revision = (String) outerSourceMap.get("field4");
              int totalCount = (Integer) outerSourceMap.get("field3");

              SearchHit[] innerHits = outerHit.getInnerHits().get("nestedDocType1").getHits();

              if(innerHits == null || innerHits.length == 0) {
                logger.error("No inner hits found for revision: "+revision);
                continue;
              }

              Map<String, Object> innerSourceMap = innerHits[0].getSourceAsMap();
              int simCount = (Integer) innerSourceMap.get("field1");

              if(((totalCount - simCount) == 0) || (simCount > ((totalCount - simCount) / 2))) {
                processedRevisions.add(revision);
              }
            }

            return true;
          }

Even in case of partial results, we expect the isTimedOut flag to be set. But that's not the case.

Can you please tell us where we are going wrong or what we are missing?

Related question: Java High Level Rest Client is not releasing connection although timeout is set

Hi @KULDEEP_SINGH,

I think the issue is the combination of scroll and timeout. This looks unsupported and as far as I can see, timeout is disabled for scroll queries.

If you are on 7.5+, you should be able to use searchAsync to do the timeout in the client instead. See: https://github.com/elastic/elasticsearch/pull/43332 and https://github.com/elastic/elasticsearch/pull/45379. I did not try this out in the high level rest client, but AFAICS it should work.

Also notice that search timeout can be confusing in that it works per shard (which I think you are aware of).

I am not entirely sure I understand what you want to achieve, maybe you can elaborate on that?

Hi @HenningAndersen,

I'm sorry about the question not being clear to you.

What we want to achieve is some kind of error thrown by Elasticsearch once the query running on Elasticsearch exceeds the defined threshold.

In my question's description, the request timeout of 10ms is that threshold.

In the scenario I explained earlier, the query took about 8 seconds and the configured request timeout was 10ms. Even if the timeout works per shard, it's unlikely that no shard timed out.

We are using Elasticsearch version 6.5.4.
The REST client is the same version.

Is there any way we can interrupt the current query execution on client side, if it overall takes more than, say, 10ms?

  • We don't want partial results either.
  • We expect either an error in that case or at least some flag which indicates that the query overran the specified threshold.

Hi @KULDEEP_SINGH,

thanks for elaborating. I think that part is clear. It is more the combination of wanting a 10ms timeout and using scroll that I find confusing. It seems unlikely to be able to do the second scroll within 10ms anyway. Do you expect the full search to time out in 10ms or do you expect it to be per request? I am curious about what you expect to get out of doing search and scroll (if that is what you do, I do not have enough of the code to verify) with such a short timeout?

Can you try without the scroll part (I assume you set this in createSearchRequest)? Just to verify that this makes search timeout work.

Hi @HenningAndersen,

I was testing on a small part of our code, where I set the timeout to 10ms. In production, it's going to be around 90-120 minutes. However, as per our analysis so far, none of our queries has taken more than 30 minutes in production. If you think setting a higher timeout might work, we can try that as well, but roughly what should it be? Seconds or minutes?

=> Do you expect the full search to time out in 10ms or do you expect it to be per request?
-- We want the complete search to time out. We don't really need a scroll timeout. As you suggested, we can try it without the scroll timeout as well and see if it works.

Please let us know if this answers your questions, or if there is anything else you want us to provide.

Thanks

Hi @KULDEEP_SINGH,

thanks for elaborating on this.

The basic problem is that scroll and timeout together are not supported. I think that if you test without scroll, it should work with the 10ms timeout.

You wrote:

We don't really need a scroll timeout

I am not sure whether you mean that you do not need to scroll through these queries. If you do not need scroll, please try without it.

If you need to scroll with timeout (that is, see all results of the query, not only the first X of them), there are a few different options:

  1. If the underlying data does not change while querying, you could manually partition the search into smaller lumps instead.
  2. Upgrade to 7.5 and use searchAsync instead. Add a client-side timeout to the query by cancelling the search request once the timeout period has elapsed.
  3. On 6.5, you can similarly use the tasks API to cancel, but this is quite involved.
  4. There may be other ways that require more knowledge on the use case to come up with.
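
To illustrate the client-side timeout idea from option 2, here is a minimal sketch using only the JDK. It does not call Elasticsearch: CancelHandle, callWithTimeout and fakeSearch are hypothetical names made up for this example, and fakeSearch merely simulates a slow search on a background thread. The assumption is that the real async call hands back some object that can cancel the in-flight request, and that its listener completes the future.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Function;

/** Hypothetical stand-in for the handle an async client call returns. */
interface CancelHandle {
    void cancel();
}

final class ClientSideTimeout {

    /**
     * Starts an async call, waits at most timeoutMillis for its result,
     * and cancels the call if the deadline passes first.
     * The starter receives the future to complete and returns a cancel handle.
     */
    static <T> T callWithTimeout(Function<CompletableFuture<T>, CancelHandle> starter,
                                 long timeoutMillis)
            throws TimeoutException, ExecutionException, InterruptedException {
        CompletableFuture<T> future = new CompletableFuture<>();
        CancelHandle handle = starter.apply(future);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Deadline exceeded: cancel the request and surface the error to the caller.
            handle.cancel();
            throw e;
        }
    }

    /** Simulated search (assumption): completes after delayMillis unless cancelled first. */
    static CancelHandle fakeSearch(CompletableFuture<String> future,
                                   long delayMillis,
                                   AtomicBoolean cancelled) {
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(delayMillis);
                future.complete("hits");
            } catch (InterruptedException e) {
                // Cancelled: stop without producing a result.
                Thread.currentThread().interrupt();
            }
        });
        worker.setDaemon(true);
        worker.start();
        return () -> {
            cancelled.set(true);
            worker.interrupt();
        };
    }
}
```

With a real client, the starter would issue the async search, complete the future from the response listener, and return whatever cancel handle the client provides. The caller then gets a TimeoutException instead of partial results, which matches the "error or nothing" behaviour asked for above. Whether cancelling on the client also stops the work on the server depends on the version, so that part should be verified separately.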
