Logstash jdbc document_id => "%{uid}" problem

Marcin_Kubica · November 20, 2015, 5:44pm

Hi
Logstash newbie here

Using logstash 2.0.0 and elastic 2.0.0

I'm trying to follow https://www.elastic.co/blog/logstash-jdbc-input-plugin and MySQL driver.

I'm setting in config

The data is being read ok-ish, (got multiple records returned) however on Elastic side I'm getting only one document indexed. In place of usual _id it's geting _id:%{uid}

If I don't use document_id => "%{uid}" I'm getting multiple documents indexed reflecting the SQL query output. However I wanted to avoid data duplication.

Any ideas?
Marcin

warkolm · November 21, 2015, 5:54am

Can you provide your entire config?

Marcin_Kubica · November 21, 2015, 8:09pm

Hi Mark, thanks for asking.

input {
  jdbc {
    jdbc_driver_library => "/opt/logstash-2.0.0/vendor/mysql/mysql-connector-java-5.1.37-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://***:***/***"
    jdbc_user => "***"
    jdbc_password => "***"

    statement_filepath => "/opt/logstash-2.0.0/sql_queries/query3.sql"
  }
}

output {


#       stdout { codec=>rubydebug }

        elasticsearch {
                index => "****"
                document_type => "****"
                document_id => "%{uid}"
                hosts => "localhost:9200"
 }
}

Christian_Dahlqvist · November 21, 2015, 9:44pm

When you enable the stdout output, do you have a field named 'uid' that the document_id can be built from?

Marcin_Kubica · November 21, 2015, 10:30pm

Good question Christian!

In the result I don't have any ID field actually.

However in the example from the URL the result of SQL query also has no 'uid' thing. It's there seen in json output though.

And there you go - I've modified my sql query to return an id and used document_id => "%{id}" and this populated the row IDs into _id field.

I'm guessing it's possible to create a hash of returned values and have it as document_id?

Christian_Dahlqvist · November 22, 2015, 9:20am

You can use the fingerprint or checksum plugins to create a hash based on specific field(s) and then use this to set document_id. If you have one or more fields in your data that make up a unique key, you may however be better off concatenating these into a key without hashing it.

Marcin_Kubica · November 22, 2015, 7:10pm

Thanks!

Architha · January 8, 2016, 10:49am

Hi ,
I have been facing the same issue. Could you tell me how you have solved it ?
Thanks !

Marcin_Kubica · January 18, 2016, 7:05am

@Architha

From your SQL query you need to return one column with unique ID - a number (or maybe a hash).

Lets say, a name of that field will be my_ID

Then in your output section of logstash config you add (under elasticsearch { })

document_id => "%{my_ID}"

And that's all. Hope this helps!

Architha · January 20, 2016, 11:23am

@Marcin_Kubica

Is this possible with an already existing column with unique ID ?
Supposed i have an existing column Employee_id and config in the same way as mentioned (under elasticsearch{ } in Output filter)

document_id => "%{Employee_id}"

I still see "_id" : "%{Employee_id}" in the search results. Kindly help.

I am not very clear with "From your SQL query you need to return one column with unique ID - a number (or maybe a hash)."
Please elaborate
Thanks a lot !

Marcin_Kubica · January 20, 2016, 8:41pm

Hi

is Employee_id (number) returned ( displayed ) as result of your SQL
select statement? I take it is not Had exactly this problem. Make sure
to include it in SELECT statement.

Marcin

Architha · January 21, 2016, 5:59am

Hi Marcin ,

The problem was solved when i used "employee_id" instead of "Employee_id" in the output section of Logstash conf file. It is case sensitive.

Thanks for your reply Cheers !!

Topic		Replies	Views
When using logstash to index data from MySQL to ElasticSearch,only the first row is being displayed Logstash	5	1722	July 6, 2017
Logstash JDBC document_id not reading latest Logstash	5	1707	February 14, 2018
Logstash document_id for elasticsearch not incrementing Logstash	3	755	July 6, 2017
Logstash conf output to elasticsearch with document_id set only indexes one record Logstash	2	795	June 7, 2021
Is it possible to have a custom document_id in Logstash? Logstash	3	16545	January 17, 2017

Logstash jdbc document_id => "%{uid}" problem

Related topics