Zombie Search Tasks on 5.4.0

Hi there,
I have a situation where, after searches complete or time out, the cluster accumulates search tasks that remain visible via /_cat/tasks | grep search.
When I look at the details via /_tasks?actions=*search&detailed (exact commands below the list), I see that they:

  • don't have a parent task ID (expected, since the originating HTTP search request has timed out)
  • have type: transport
  • have the action indices:data/read/search
These pile up and lead to excessive search timeouts. Bouncing the node fixes the issue.
I am not able to pin down whether these zombie tasks are causing the search timeouts, or whether shards initializing due to cluster instability are causing the increasing search timeouts.
At the very least, I expected that the zombie tasks should not pile up.
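
For reference, this is roughly how I am inspecting the tasks (host and port are placeholders for my cluster):

```
# List current tasks and filter for search actions
curl -s 'localhost:9200/_cat/tasks' | grep search

# Show detailed information for the lingering search tasks
curl -s 'localhost:9200/_tasks?actions=*search&detailed&pretty'
```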

In the logs, after enabling TRACE logging, I do see a possible correlation (it could be a red herring): the tasks target indices that were being recovered or de-allocated from the node on which the task is left behind.

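For context, I enabled the trace output along these lines (a sketch; I derived the full logger name org.elasticsearch.tasks.TaskManager from the o.e.t.TaskManager prefix in the log lines, so it may need adjusting):

```
# Temporarily raise the TaskManager logger to TRACE so that task
# register/unregister calls are written to the log
# (revert later by setting the value back to null)
curl -s -XPUT 'localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d '{
  "transient": {
    "logger.org.elasticsearch.tasks.TaskManager": "trace"
  }
}'
```
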
I see messages like
[2018-04-25T19:18:53,013][TRACE][o.e.t.TaskManager ] [host3-1] register 106886 [transport] [indices:data/read/search] [indices[xx-2018.04.09], types[], search_type[QUERY_THEN_FETCH], source[{"size":0,"query":{"range":{"@timestamp":{"from":"2018-03-25T05:13:09.405432","to":"2018-04-24T05:13:09.405432","include_lower":true,"include_upper":false,"time_zone":"UTC","boost":1.0}}},"sort":[{"@timestamp":{"order":"desc","unmapped_type":"date"}}]}]]
but, for the zombie task IDs, there is no corresponding "unregister task for id:" message on the node where the zombie tasks pile up. Even so, the tasks still show up in the task management API.
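
As a rough cross-check, I compared registered vs. unregistered task IDs from the trace log along these lines (a sketch; the log path is a placeholder and the message patterns are taken from the trace lines quoted above):

```
# Task IDs registered on this node
grep 'o.e.t.TaskManager' elasticsearch.log | grep ' register ' \
  | sed -E 's/.* register ([0-9]+) .*/\1/' | sort -u > registered.txt

# Task IDs that were later unregistered
grep 'o.e.t.TaskManager' elasticsearch.log | grep 'unregister task for id' \
  | sed -E 's/.*unregister task for id: ([0-9]+).*/\1/' | sort -u > unregistered.txt

# IDs registered but never unregistered == zombie candidates
comm -23 registered.txt unregistered.txt
```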

Any possible pointers? Could this be a bug?
