Suppose I have a MySQL database and I connected it to Elasticsearch through the ELK stack. The MySQL database has 100,000 (1 lakh) records, and all of that data is shown in Kibana. After some time, MySQL is updated with 50k more records. My question is: how does ELK fetch data from MySQL? Does it traverse all the MySQL records, or does it fetch data incrementally (only the latest records)? If it traverses all the records again each time, the MySQL database will come under more load and may slow down.
That's not what I meant. I meant a direct connection from your application to Elasticsearch, not reading the data later with Logstash and sending it to Elasticsearch.
The former is "real time". The latter is not.
Next time, please format your code according to the guide (read the About the Elasticsearch category). I'm editing your post, but your indentation is wrong, which makes your config harder to read.
Then I'm moving your question to the Logstash category, as it's a Logstash question.
This is wrong. Use LOG_ID, not %{LOG_ID}, if you want it to use the LOG_ID column for tracking.
However, this won't do any good unless you add a WHERE condition to your SELECT statement to restrict the query from returning rows older than the recorded value of the LOG_ID column. See the jdbc input documentation for examples.
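As a rough sketch of what that combination looks like, a jdbc input that tracks a LOG_ID column might be configured as below. The connection string, credentials, table name, and file paths are all placeholders, not taken from the original post:

```conf
input {
  jdbc {
    jdbc_driver_library => "/path/to/mysql-connector-java.jar"
    jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "myuser"
    jdbc_password => "mypassword"
    schedule => "* * * * *"          # run the query once per minute
    use_column_value => true
    tracking_column => "log_id"      # plain column name, not %{LOG_ID}
    statement => "SELECT * FROM logs WHERE log_id > :sql_last_value ORDER BY log_id ASC"
  }
}
```

Note that the jdbc input lowercases column names by default (lowercase_column_names), so the tracking column is written as log_id even if the MySQL column is named LOG_ID.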
First, I have removed the grok filter. In my conf file I have used only a select statement without any conditions. What is the best way to avoid putting so much load on the MySQL database? I want to set up Elasticsearch in such a way that it fetches data incrementally, without querying the whole MySQL database again and again. When new entries arrive in the MySQL database, Elasticsearch should fetch only those records.
This is what the sql_last_value parameter is for. After each query execution, Logstash records a column value from the last processed row, and the next time the query runs that value is substituted for the sql_last_value parameter. Use that parameter in your query to fetch only rows that are more recent. Obviously, the column you use for this purpose must be a "last modified" timestamp or something else that is ever increasing.
Again, this is explained (with examples) in the jdbc input documentation.
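To illustrate the timestamp variant of the same idea, a sketch of a jdbc input that tracks a "last modified" column follows. The updated_at column name, connection details, and metadata path are assumptions for the example, not part of the original config:

```conf
input {
  jdbc {
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "myuser"
    jdbc_password => "mypassword"
    schedule => "*/5 * * * *"                     # run every five minutes
    use_column_value => true
    tracking_column => "updated_at"
    tracking_column_type => "timestamp"
    # where Logstash persists the last seen value between runs
    last_run_metadata_path => "/var/lib/logstash/.jdbc_last_run"
    statement => "SELECT * FROM logs WHERE updated_at > :sql_last_value ORDER BY updated_at ASC"
  }
}
```

On each scheduled run, only rows newer than the stored sql_last_value are fetched, so MySQL is never asked to return the full table again.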