document_id not set in Elasticsearch, even though it is configured in logstash.conf

Here are the contents of my logstash.conf:

input {
	http {
		host => "127.0.0.1"
		port => 31311 
	}
}

filter {
	mutate {
		split => { "[headers][request_path]" => "/"}
		add_field => { "index_id" => "%{[headers][request_path][1]}" }
	}
	
	ruby { 
		code => "event.set('request_path_length', event.get('[headers][request_path]').length)" 
	}
	
	if [request_path_length == 3] {
		mutate {
			add_field => { "document_id" => "%{[headers][request_path][2]}" }
		}
	}
}
	
output {
	stdout {
		codec => "rubydebug"
	}
  
	if [request_path_length == 3] {
		elasticsearch {
			hosts => "http://localhost:9200"
			index => "%{index_id}"
			document_id => "%{[headers][request_path][2]}"
		}
	}
	else {
		elasticsearch {
			hosts => "http://localhost:9200"
			index => "%{index_id}"
		}
	}
}

As a test, I ran the following PowerShell command:

C:\Users\BolverkXR\Downloads\curl-7.64.1-win64-mingw\bin> .\curl.exe -XPUT 'http://127.0.0.1:31311/twitter_new/7'

I see the following output on my Logstash terminal:

{
                "message" => "",
               "@version" => "1",
                   "host" => "127.0.0.1",
             "@timestamp" => 2019-04-09T11:35:22.458Z,
    "request_path_length" => 3,
                "headers" => {
              "http_host" => "127.0.0.1:31311",
         "content_length" => "0",
           "request_path" => [
            [0] "",
            [1] "twitter_new",
            [2] "7"
        ],
            "http_accept" => "*/*",
           "http_version" => "HTTP/1.1",
        "http_user_agent" => "curl/7.64.1",
         "request_method" => "PUT"
    },
               "index_id" => "twitter_new"
}

As you can see, document_id is not set to 7, even though that is what I would expect.

How can I fix this?

Try this instead. The square brackets should enclose only the field name, with the comparison outside them:

if [request_path_length] == 3 {

Thanks for your reply. After changing my logstash.conf file as you suggested, I made another PUT request to /twitter_new/8. I then made a GET request to retrieve all entries, and this was the entry corresponding to the latest PUT request I made:

{
	"_index": "twitter_new",
	"_type": "doc",
	"_id": "O5AIAmoBCWsefMj-o7Fw",
	"_score": 1,
	"_source": {
		"message": "",
		"document_id": "8",
		"@version": "1",
		"@timestamp": "2019-04-09T12:18:00.665Z",
		"index_id": "twitter_new",
		"request_path_length": 3,
		"headers": {
			"request_path": [
				"",
				"twitter_new",
				"8"
			],
			"http_accept": "*/*",
			"http_version": "HTTP/1.1",
			"content_length": "0",
			"request_method": "PUT",
			"http_user_agent": "curl/7.64.1",
			"http_host": "127.0.0.1:31311"
		},
		"host": "127.0.0.1"
	}
}

As you can see, inside _source, document_id is indeed set to 8, but _id is still a randomly generated string. I would expect _id to be 8 as well, just as _index is twitter_new. Am I misunderstanding something?

Did you change the "if" condition in the output section as well as the one in the filter section?

Ah, of course! Silly me.
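
For reference, the corrected configuration would look roughly like this. It is a sketch based on the config posted above, with only the two conditionals changed; it has not been run against a live cluster here.

```
input {
	http {
		host => "127.0.0.1"
		port => 31311
	}
}

filter {
	# Split "/twitter_new/8" into ["", "twitter_new", "8"]
	mutate {
		split => { "[headers][request_path]" => "/" }
		add_field => { "index_id" => "%{[headers][request_path][1]}" }
	}

	# Record how many path segments the request had
	ruby {
		code => "event.set('request_path_length', event.get('[headers][request_path]').length)"
	}

	# Brackets wrap the field name only; the comparison goes outside
	if [request_path_length] == 3 {
		mutate {
			add_field => { "document_id" => "%{[headers][request_path][2]}" }
		}
	}
}

output {
	stdout {
		codec => "rubydebug"
	}

	# Same fix as in the filter section
	if [request_path_length] == 3 {
		elasticsearch {
			hosts => "http://localhost:9200"
			index => "%{index_id}"
			document_id => "%{[headers][request_path][2]}"
		}
	}
	else {
		elasticsearch {
			hosts => "http://localhost:9200"
			index => "%{index_id}"
		}
	}
}
```

With both conditionals fixed, a PUT to /twitter_new/8 should be indexed with _id set to 8 instead of an auto-generated ID.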
