I am using the JDBC river for Elasticsearch to index data from a MySQL table.
My river definition:
curl -XPUT 'localhost:9200/_river/river_mention_reports/_meta' -d '{
    "type" : "jdbc",
    "jdbc" : {
        "driver" : "com.mysql.jdbc.Driver",
        "url" : "jdbc:mysql://localhost:3306/ESTest1_development",
        "user" : "root",
        "password" : "password",
        "sql" : "select * from table where creation_time >= (NOW() - INTERVAL 2 MINUTE)",
        "poll" : "2m",
        "versioning" : false
    },
    "index" : {
        "index" : "monitoring",
        "type" : "mention_reports"
    }
}'
The SQL query specified in the river is:
select * from table where creation_time >= (NOW() - INTERVAL 2 MINUTE)
The problem is that after every poll the river removes documents that were indexed earlier but now fall outside the time range in the query (the last 2 minutes), instead of only adding the new rows to the index. I specified a time range only because I don't want the river to reindex the entire dataset on every poll.
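The behaviour can be confirmed by comparing the document count between two polls. This is a minimal sketch, assuming Elasticsearch is reachable at localhost:9200 and using the index and type from the river definition; `count_docs` is a hypothetical helper name, not part of the river.

```shell
# Hypothetical helper: prints the document count for an index/type
# using Elasticsearch's _count endpoint.
# Defaults are assumptions taken from the river definition above.
count_docs() {
  cd_host="${1:-localhost:9200}"
  cd_index="${2:-monitoring}"
  cd_type="${3:-mention_reports}"
  curl -s -XGET "http://${cd_host}/${cd_index}/${cd_type}/_count?pretty"
}
```

Calling `count_docs` right after one poll and again after the next poll (2 minutes later) should show the count dropping rather than growing if older documents are being removed.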