How does Logstash generate '_id' when outputting to Elasticsearch?

The Elasticsearch _id field is set to what looks like a hash value, e.g.

"_id": "AVVpVdoIEU8QqcRWkA9P",

How is this calculated? And where?

matt@elk:~$ cat /etc/logstash/conf.d/apache.conf 
input {
    file {
        path => '/var/log/apache2/access.log'
    }
}
    
filter {
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
}
    
output {
    elasticsearch { 
    }
}

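Elasticsearch (ES) automatically creates an ID if you don't specify one. Logstash does not calculate it. The auto-generated IDs are 20-character, URL-safe, Base64-encoded GUID strings, not a hash of the document contents.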

In this case, the Logstash elasticsearch output plugin sends the events to the _bulk endpoint without any _id, so ES generates the IDs for your newly indexed documents itself. If you want to provide your own IDs from within Logstash, you can do so by specifying the document_id setting:

output {
    elasticsearch { 
        document_id => "%{some_id_field}"
    }
}
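
If your events don't already carry a suitable ID field, one common approach is to derive an ID with the fingerprint filter and pass it to document_id. The following is only a sketch under some assumptions: hashing the message field alone is an illustrative choice, the key value is a placeholder, and the [@metadata][fingerprint] field name is arbitrary.

```
filter {
    fingerprint {
        # Assumption: the raw log line is what identifies an event
        source => "message"
        # @metadata fields are not sent to Elasticsearch as part of the document
        target => "[@metadata][fingerprint]"
        method => "SHA256"
        # Placeholder key; with a key set, the plugin computes an HMAC.
        # Some older plugin versions require a key for the SHA methods.
        key => "replace-with-your-own-key"
    }
}

output {
    elasticsearch {
        document_id => "%{[@metadata][fingerprint]}"
    }
}
```

With an ID derived this way, re-ingesting the same line overwrites the existing document rather than creating a duplicate. The flip side is that two genuinely distinct requests that produce identical log lines would also collapse into one document, so this is probably only appropriate when the hashed fields are unique per event.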