Hi,
I use Logstash to synchronize a MariaDB database to an Elasticsearch index. My problem is that every query and all of the data end up in my Logstash logs, so the log file becomes almost as big as the index.
How can I configure the stdout output of Logstash to log only some of the data?
Here is my Logstash pipeline:
input {
  jdbc {
    jdbc_driver_library => ".."
    jdbc_driver_class => ".."
    jdbc_connection_string => ".."
    ...
    statement_filepath => '....sql'
  }
}
filter {
  .... # Here I format the input to correspond to the ES mapping
}
output {
  elasticsearch {
    hosts => '...'
    document_id => "%{[@metadata][_id]}"
    action => "%{[@metadata][_elasticsearch_action]}"
    ilm_enabled => true
    manage_template => false
    ilm_rollover_alias => '...'
  }
  stdout { codec => rubydebug } # I'm trying to do something here
}
My output logs look like this:
{"log":"Using bundled JDK: /usr/share/logstash/jdk\n","stream":"stdout","time":"2021-06-18T08:53:02.063170815Z"}
{"log":"OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.\n","stream":"stderr","time":"2021-06-18T08:53:02.079546026Z"}
{"log":"Sending Logstash logs to /usr/share/logstash/logs which is now configured via log4j2.properties\n","stream":"stdout","time":"2021-06-18T08:53:13.260990754Z"}
..
{"log":"[2021-06-18T08:54:00,734][INFO ][logstash.inputs.jdbc ][main][4073ae3f0baaedb0df0a19d24c92321b495d2c58d37529f41296e82999c942ee] (0.032062s) SELECT count(*) AS \"COUNT\" FROM ...
**HERE IS A HUGE QUERY WHICH IS CALLED EVERY MINUTE**
{"log":"WHERE (UNIX_TIMESTAMP(participations.ROW_START) \u003e 0 AND participations.ROW_START \u003c NOW() AND participations.ROW_END \u003e NOW())) AS \"T1\" LIMIT 1\n","stream":"stdout","time":"2021-06-18T08:54:00.73571993Z"}
...
After that, all of the data from my database is written to this log file.
I'll use the logging parameter in my docker-compose file to limit the log file size, but that's not what I want. I would like to have:
{"log":"Using bundled JDK: /usr/share/logstash/jdk\n","stream":"stdout","time":"2021-06-18T08:53:02.063170815Z"}
{"log":"OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.\n","stream":"stderr","time":"2021-06-18T08:53:02.079546026Z"}
{"log":"Sending Logstash logs to /usr/share/logstash/logs which is now configured via log4j2.properties\n","stream":"stdout","time":"2021-06-18T08:53:13.260990754Z"}
..
**The time the query is launched, but not the whole query.**
**Only the identifiers of the documents from MariaDB, not the full document content.**
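To make the goal concrete, here is a minimal, untested sketch of the kind of configuration that might get close to it. It assumes the jdbc input's `sql_log_level` option and the `line` codec's `format` option behave as documented. Note that raising the SQL log level would probably drop the query log line entirely rather than keep only its timestamp, so this is only an approximation of what I want. The elided values stay elided.

```
input {
  jdbc {
    # connection settings unchanged from the pipeline above (elided)
    statement_filepath => '....sql'
    # Assumption: SQL statements are logged at "info" by default;
    # "warn" should keep them out of the INFO-level log
    sql_log_level => "warn"
  }
}
output {
  # elasticsearch output unchanged from the pipeline above (elided)
  stdout {
    # Print one short line per event (timestamp and document id)
    # instead of the full rubydebug dump of every field
    codec => line { format => "%{@timestamp} %{[@metadata][_id]}" }
  }
}
```

With something like this, each synchronized document should produce a single line with its processing time and identifier, not its full content.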
Does anyone have an idea how to achieve this?
Thanks a lot for your help,
Best,
audrey