Logstash jdbc_static filter plugin performing very poorly (nearly not at all)

tomj · January 14, 2020, 9:08pm

Ugh, sorry, it's a tar.gz file

tomj · January 14, 2020, 9:10pm

New link:

Badger · January 14, 2020, 11:05pm

Looking at the thread dump, there are 32 worker threads for the "logs" pipeline.

"[logs]>worker0" is waiting to acquire a connection. So are threads 1, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 17, 18, 19, 22, 24, 25, 26, 28, 29, 30, 31.

"[logs]>worker2" is basically doing the same, it is just on a different wait on the mutex (as a result of a race condition -- something put the connection back into the pool, and this thread was notified, but failed to get the mutex because another thread beat it to it -- that is not a problem). So are threads 16, 20, 21.

"[logs]>worker5" is down under org.apache.derby.impl.jdbc.EmbedStatement.executeQuery, down under logstash.filters.jdbc.lookup.RUBY$method$fetch$0, doing a lookup in a jdbc_static filter. So are threads 15, 23, 27.

All the worker threads have been up for 2,507 seconds and have each used around 124 seconds of CPU. That tells me that they are all processing events. The fact that most of the threads are waiting for connections tells me the lookup queries are much more expensive that they should be. I cannot think why.

It would be really nice to be able to enable the logging that the underlying jdbc code has so that we could see the query execution times, but I do not see a way to do that without modifying the jdbc_static code.

tomj · January 14, 2020, 11:11pm

Thank you. I would be willing to try to enable the logging you refer to, but so far I have only rebuilt Logstash plugins to test certain hypotheses.

I'll look at the jdbc_static code tomorrow to see if I can get at that logging setting without having to rebuild the sequel gem.

tomj · January 18, 2020, 11:42pm

I'm going to mark this as resolved since I have decided to move on to using the translate filter with a CSV file that is periodically generated from SQL calls. After testings, this approach reduced the amount of CPU used and it is processing about 8000 events per second for each Logstash instance.

Christian_Dahlqvist · January 19, 2020, 6:47am

Another option might be to use the jdbc_streaming plugin. As far as I recall it does not rely on a local database which might mean your dataset with tuning might be held in memory as it is quite small.

tomj · January 21, 2020, 10:35pm

Thank you, I did consider that, but was biased against it by the behavior of jdbc_static, where the absence of any field to use as a lookup would generate a warning message for each such event.

This produced millions of warnings which I then ignored with the appropriate logging settings.

Now after looking at the jdbc_streaming code, maybe that doesn't act the same as jdbc_static. Regardless, I'm going ahead with the "translate" filter for now because I don't have to write a lot of extra code.

system · February 18, 2020, 10:35pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Logstash and databse Logstash	16	5695	July 6, 2017
Logstash-input-jdbc doesn't work on logstash 5.x Logstash	11	2447	January 30, 2017
Jdbc-streaming plugin seems not working Logstash	44	7863	June 21, 2017
Error registering plugin with jdbc_static plugin Logstash	8	863	February 20, 2019
Logstash high CPU on JDBC plugin Logstash	2	496	February 7, 2021

Logstash jdbc_static filter plugin performing very poorly (nearly not at all)

Related topics