Logstash Failing after setting a custom document ID


(Ganessen Mootheeveeren) #1

Hello,

I am using ELK version 6.4.0 on a Linux machine.
Logstash fails when I try to set a custom document ID, but if I remove the document_id parameter from my output config file, I can import data from the database into Elasticsearch without any problem.
I am using a custom document ID to prevent duplicate data from being imported into Elasticsearch.

Error found in logstash logs:

[2018-09-11T06:36:16,953][ERROR][logstash.outputs.elasticsearch] Encountered a retryable error. Will Retry with exponential backoff {:code=>400, :url=>"http://localhost:9200/_bulk"}

This is my configuration file:

input {
    jdbc {
        jdbc_driver_library => "/data/sqljdbc_6.0/enu/jre8/sqljdbc42.jar"
        jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
        jdbc_connection_string => "jdbc:sqlserver://xx.xx.xx.xx:1433;instanceName=xxx;databasename=xxx"
        jdbc_user => "Elastic"
        jdbc_password => "xxxx"
        schedule => "* * * * *"
        statement => "SELECT TOP 1000 * FROM [xxxx].[xx].[table_x]"
    }
}

output {
    elasticsearch {
        hosts => ["localhost:9200"]
        user => "elastic"
        password => "password"
        index => "rejection"
        document_type => "rejection"
        document_id => "%{[date_run]}_%{[nat_key]}"
    }
    stdout {}
}

I found a post about this same issue on this forum, which suggests it is a permission problem:
(see link)

I am in fact using the elastic user, which normally has the highest level of permissions, but Logstash still fails.
I have even tried the logstash_internal and logstash_admin users, which I created by following this link, but to no avail.

So, my questions are:
Is there any way to use a custom document ID without causing Logstash to fail?
Or is there a user I need to create that would have the necessary permissions to set the document_id?
Is there any way to ingest unique documents without using the document_id parameter?

Thanks in advance.
Ganessen.


(Ganessen Mootheeveeren) #2

Hello,

Can someone please help?

Thanks,
Ganessen


(Christian Dahlqvist) #3

What is the format of the fields you are using to create the document ID? Can any of them contain inappropriate characters? Are these fields set for all documents?


(Ganessen Mootheeveeren) #4

Hello Christian,

I'm using the following two fields to construct the document ID.
Format in SQL Server:
date_run ==> datatype datetime
nat_key ==> datatype nchar

Once imported into Elasticsearch, it turns out that nat_key contains some inappropriate characters as well as some extra, unnecessary whitespace.



Could it be the reason for this failure?

Thanks,
Ganessen


(Christian Dahlqvist) #5

Yes, if the field contains inappropriate characters, that is likely the reason. One way around this could be to hash the value into a SHA1 hash using the fingerprint filter plugin and use that as the document ID instead.
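
A minimal sketch of that approach, assuming the standard logstash-filter-fingerprint plugin and the field names from the original config (the [@metadata][generated_id] target field is just an illustrative name; other elasticsearch output options are omitted for brevity):

    filter {
        fingerprint {
            # Hash the concatenation of the two source fields into a stable ID
            source => ["date_run", "nat_key"]
            concatenate_sources => true
            method => "SHA1"
            # Store the hash in @metadata so it is not indexed as a field
            target => "[@metadata][generated_id]"
        }
    }

    output {
        elasticsearch {
            hosts => ["localhost:9200"]
            index => "rejection"
            # Use the hash as the document ID; re-running the same query
            # then updates existing documents instead of creating duplicates
            document_id => "%{[@metadata][generated_id]}"
        }
    }

Because the hash is deterministic for a given date_run/nat_key pair, this keeps the deduplication behavior while avoiding any problematic characters in the ID itself.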


(system) #6

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.