DLQ not enabled for Logstash

I have two events: test_start and test_end

When I receive a test_start event, I clone one more event from it, test_attempt, and index it in Elasticsearch.
Then, when the test_end event arrives, I look up the document id and index of the associated test_attempt document and update that document with data generated from the test_end event.
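For context, the clone step on the input side looks roughly like this (a sketch, not my exact config — the clone filter sets `[type]` to the clone name, and the field names used to mark the clone are illustrative):

```
filter {
        if [session][event] == "TEST_START" {
                # produce a copy of the event; the copy gets type == "test_attempt"
                clone {
                        clones => ["test_attempt"]
                }
                if [type] == "test_attempt" {
                        mutate {
                                add_field => { "derived_event" => "TEST_ATTEMPT" }
                        }
                }
        }
}
```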

Below is the pipeline configuration:

```
        mutate {
                add_field => {
                        "testEndTime" => "%{event_time}"
                }
        }

        elasticsearch {
                hosts => ["172.31.2.218:9200", "172.31.2.208:9200", "172.31.1.29:9200"]
                index => "service-logs-*"
                query => "session.connId:1234 AND derived_event:TEST_ATTEMPT"
                result_size => 1
                docinfo_fields => {
                        "_id" => "[@metadata][doc_id]"
                        "_index" => "[@metadata][doc_index]"
                }
        }
}
```

Output configuration:

```
# update the existing document if derived_event is TEST_ATTEMPT and the session event is TEST_END
if [derived_event] == "TEST_ATTEMPT" and [session][event] == "TEST_END" {
        elasticsearch {
                hosts => ["172.31.2.218:9200", "172.31.2.208:9200", "172.31.1.29:9200"]
                index => "%{[@metadata][doc_index]}"
                document_id => "%{[@metadata][doc_id]}"
                action => "update"
        }
} else {
        # create a new document in elasticsearch
        elasticsearch {
                hosts => ["172.31.2.218:9200", "172.31.2.208:9200", "172.31.1.29:9200"]
                index => "service-logs-%{do_service}-%{+YYYY.MM.dd}"
                action => "create"
                data_stream => "false"
                ilm_enabled => "false"
        }
}
```



Now, when both events are triggered with a negligible gap between them, Elasticsearch returns a 404 because it cannot find the document yet.

So I am trying to enable the DLQ and then reprocess the failed events after some delay.

But I am not able to turn the DLQ on, even though I have updated logstash.yml with:
`dead_letter_queue.enable: true`
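For the reprocessing side, the plan is a second pipeline that reads from the DLQ with the dead_letter_queue input plugin, something like this (a sketch — the path and pipeline id are assumptions for illustration, not my actual values):

```
input {
        dead_letter_queue {
                # directory where Logstash writes DLQ segments
                path => "/var/lib/logstash/dead_letter_queue"
                # the pipeline whose DLQ we want to drain
                pipeline_id => "main"
                # remember position so events are not re-read on restart
                commit_offsets => true
        }
}
```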


After turning on the DLQ, I can see a directory created for dead_letter_queue, but no files are created inside it, even after the elasticsearch output plugin throws a 404.

Also, when I run `/_cat/indices`, I can see an index literally named `%{[@metadata][doc_index]}`.

It gets created when the elasticsearch filter lookup fails: the output responds with a 404, yet this index is still created.

What am I doing wrong?

logstash version: 7.14

What exactly gets the 404 error? Is it the elasticsearch filter? If it is the elasticsearch output, is it the bulk request itself or one of the documents within it?

The output plugin throws the 404 error; it complains that the document does not exist.
When the lookup fails in the elasticsearch filter plugin, the metadata fields are not populated with the index name and doc id, and I think that is the reason for the 404.

You need to distinguish whether the _bulk request is getting a 404 response or a document within it is getting a 404 response. The latter goes to the DLQ, the former does not.

It looks like a bulk request to me.

[WARN ] 2021-08-10 19:58:16.339 [[main]>worker0] elasticsearch - Could not index event to Elasticsearch. {:status=>404, :action=>["update", {:_id=>"%{[@metadata][doc_id]}", :_index=>"%{[@metadata][doc_index]}", :routing=>nil, :retry_on_conflict=>1}, {"message"=>"[connector] AMQP connecting.....", "event_time"=>"2021-08-10T01:25:07.125Z", "session"=>{"event"=>"TEST_END", "connId"=>"1234"}, "do_service"=>"ds-websocket", "derived_event"=>"TEST_ATTEMPT", "source_ip"=>"172.31.1.155", "level"=>"info", "type"=>"derived", "@timestamp"=>2021-08-10T19:57:54.222Z, "@version"=>"1", "testEndTime"=>"2021-08-10T01:25:07.125Z"}], :response=>{"update"=>{"_index"=>"%{[@metadata][doc_index]}", "_type"=>"_doc", "_id"=>"%{[@metadata][doc_id]}", "status"=>404, "error"=>{"type"=>"document_missing_exception", "reason"=>"[_doc][%{[@metadata][doc_id]}]: document missing", "index_uuid"=>"lzVTEX1SQQKOmjiiRe7mJA", "shard"=>"0", "index"=>"%{[@metadata][doc_index]}"}}}}

When I turn on debug logging, I can see:

[DEBUG] 2021-08-10 20:08:54.686 [[main]>worker0] DefaultManagedHttpClientConnection - http-outgoing-3: set socket timeout to 60000
[DEBUG] 2021-08-10 20:08:54.686 [[main]>worker0] MainClientExec - Executing request POST /_bulk HTTP/1.1
[DEBUG] 2021-08-10 20:08:54.687 [[main]>worker0] MainClientExec - Target auth state: UNCHALLENGED
[DEBUG] 2021-08-10 20:08:54.687 [[main]>worker0] MainClientExec - Proxy auth state: UNCHALLENGED
[DEBUG] 2021-08-10 20:08:54.687 [[main]>worker0] headers - http-outgoing-3 >> POST /_bulk HTTP/1.1
[DEBUG] 2021-08-10 20:08:54.687 [[main]>worker0] headers - http-outgoing-3 >> Connection: Keep-Alive
[DEBUG] 2021-08-10 20:08:54.687 [[main]>worker0] headers - http-outgoing-3 >> Content-Type: application/json
[DEBUG] 2021-08-10 20:08:54.687 [[main]>worker0] headers - http-outgoing-3 >> Content-Length: 474
[DEBUG] 2021-08-10 20:08:54.688 [[main]>worker0] headers - http-outgoing-3 >> Host: 172.31.2.208:9200
[DEBUG] 2021-08-10 20:08:54.688 [[main]>worker0] headers - http-outgoing-3 >> User-Agent: Manticore 0.7.0
[DEBUG] 2021-08-10 20:08:54.688 [[main]>worker0] headers - http-outgoing-3 >> Accept-Encoding: gzip,deflate
[DEBUG] 2021-08-10 20:08:54.689 [[main]>worker0] wire - http-outgoing-3 >> "POST /_bulk HTTP/1.1[\r][\n]"
[DEBUG] 2021-08-10 20:08:54.690 [[main]>worker0] wire - http-outgoing-3 >> "Connection: Keep-Alive[\r][\n]"
[DEBUG] 2021-08-10 20:08:54.690 [[main]>worker0] wire - http-outgoing-3 >> "Content-Type: application/json[\r][\n]"
[DEBUG] 2021-08-10 20:08:54.690 [[main]>worker0] wire - http-outgoing-3 >> "Content-Length: 474[\r][\n]"
[DEBUG] 2021-08-10 20:08:54.690 [[main]>worker0] wire - http-outgoing-3 >> "Host: 172.31.2.208:9200[\r][\n]"
[DEBUG] 2021-08-10 20:08:54.690 [[main]>worker0] wire - http-outgoing-3 >> "User-Agent: Manticore 0.7.0[\r][\n]"
[DEBUG] 2021-08-10 20:08:54.691 [[main]>worker0] wire - http-outgoing-3 >> "Accept-Encoding: gzip,deflate[\r][\n]"
[DEBUG] 2021-08-10 20:08:54.691 [[main]>worker0] wire - http-outgoing-3 >> "[\r][\n]"
[DEBUG] 2021-08-10 20:08:54.691 [[main]>worker0] wire - http-outgoing-3 >> "{"update":{"_id":"%{[@metadata][doc_id]}","_index":"%{[@metadata][doc_index]}","routing":null,"retry_on_conflict":1}}[\n]"
[DEBUG] 2021-08-10 20:08:54.691 [[main]>worker0] wire - http-outgoing-3 >> "{"doc":{"message":"[connector] AMQP connecting.....","@version":"1","do_service":"ds-websocket","event_time":"2021-08-10T01:25:07.125Z","type":"derived","source_ip":"172.31.1.155","derived_event":"TEST_ATTEMPT","@timestamp":"2021-08-10T20:08:31.340Z","level":"info","testEndTime":"2021-08-10T01:25:07.125Z","session":{"event":"TEST_END","connId":"1234"}}}[\n]"
[DEBUG] 2021-08-10 20:08:54.701 [[main]>worker0] wire - http-outgoing-3 << "HTTP/1.1 200 OK[\r][\n]"
[DEBUG] 2021-08-10 20:08:54.702 [[main]>worker0] wire - http-outgoing-3 << "content-type: application/json; charset=UTF-8[\r][\n]"
[DEBUG] 2021-08-10 20:08:54.702 [[main]>worker0] wire - http-outgoing-3 << "content-encoding: gzip[\r][\n]"
[DEBUG] 2021-08-10 20:08:54.702 [[main]>worker0] wire - http-outgoing-3 << "content-length: 223[\r][\n]"
[DEBUG] 2021-08-10 20:08:54.702 [[main]>worker0] wire - http-outgoing-3 << "[\r][\n]"

How do I handle failed requests in the case of the _bulk API?

That is here in the code. It is a response for a document within the bulk request, and that kind of response is sent for DLQ processing.

However, looking at that processing: if the DLQ is enabled, the event is written to it; otherwise it logs the message that you are seeing. Logstash does not think you have the DLQ enabled.

If the call to _bulk itself fails, it is retried forever. This was recently changed to exclude status 413 (content too large), since that will keep failing forever and effectively stops logstash from processing anything. There are other cases where the retried call will fail forever, but I cannot think of them right now.

Thanks for pointing out the code blocks. I will dig into it.

But why do I see a %{[@metadata][doc_index]} index created when I query /_cat/indices?
It is an empty index, but if the field reference cannot be resolved, shouldn't it evaluate to null rather than the literal string?

No, the literal string is expected. When a sprintf reference like %{[@metadata][doc_index]} cannot be resolved, logstash leaves it in place as-is, so the output creates an index with that literal name.
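If you want to keep those events out of the update branch entirely, you could add a guard on the metadata field, e.g. (a sketch based on your output config):

```
# only attempt the update when the elasticsearch filter actually populated the metadata
if [derived_event] == "TEST_ATTEMPT" and [session][event] == "TEST_END" and [@metadata][doc_id] {
        elasticsearch {
                hosts => ["172.31.2.218:9200", "172.31.2.208:9200", "172.31.1.29:9200"]
                index => "%{[@metadata][doc_index]}"
                document_id => "%{[@metadata][doc_id]}"
                action => "update"
        }
}
```

Events that fail the guard would then fall through to whatever other branch you route them to, instead of producing updates against a literal index name.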

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.