Filter_jdbc_static & Apache Derby

Hi,
I'm having an issue with this filter. When I launch jdbc_static and throw some data its way, the log of the internal Apache Derby instance says this:

java.lang.StackOverflowError
	at org.apache.derby.impl.sql.compile.TableOperatorNode.bindNonVTITables(Unknown Source)
	at org.apache.derby.impl.sql.compile.TableOperatorNode.bindNonVTITables(Unknown Source)
	at org.apache.derby.impl.sql.compile.TableOperatorNode.bindNonVTITables(Unknown Source)
	at org.apache.derby.impl.sql.compile.TableOperatorNode.bindNonVTITables(Unknown Source)

And Apache Derby itself (obviously) crashes, which I find very odd given what the docs state:

MAX_ROWS: The default for this setting is 1 million. Because the lookup database is in-memory, it will take up JVM heap space. If the query returns many millions of rows, you should increase the JVM memory given to Logstash or limit the number of rows returned, perhaps to those most frequently found in the event data.

I'm trying to insert some 70,000 records (key + 3 values), which isn't anywhere close to 1 million. The JVM heap size is set to -Xms8g and -Xmx8g, which I believe should be enough too.

Googling the Earth, I found this issue related to Derby itself: [DERBY-5981] Derby INSERT Eats Stack Space, Causes java.lang.StackOverflowError. Experimenting with how many records Derby can take without becoming a dead parrot on sight, I discovered that feeding it ~7,000 records does not trigger the StackOverflowError.

So my questions are:

  1. Are my conclusions correct, and does Apache Derby really have this kind of problem? Has any of you observed similar behaviour? The database I'm fetching the data from is Oracle.
  2. The suggested actions for similar problems I found were "turn off autocommit" and "upload data from CSV". I don't see any way of doing this at the plugin level, as it does not accept any parameters for this (or I was not able to find them).
  3. What would Gandhi do now?

I ended up using the jdbc_streaming filter, which does basically the same thing (and has many more tuning options, pah!), but the data in the source database changes only once a day, so my inner autistic self is really in pain here.

Any help appreciated, thank you.

(I'm not including my jdbc_static configuration as it is basically the doc snippet itself.)

I like the cut of your jib.

I wrote the filter, so I should know the answer, but I have not yet seen much use of it in varied scenarios.

Going by https://issues.apache.org/jira/browse/DERBY-1735, your conclusion is correct: a high number of VALUES in the insert clause does trigger the Derby bug.

I will have to implement some form of paging for loader resultsets larger than, say, 5000 rows, but I am not able to do this immediately.
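To make the paging idea concrete, here is a minimal sketch in Python of splitting a large result set into batches of at most 5000 rows and building one multi-row INSERT per batch, so that no single statement carries tens of thousands of VALUES rows. This is only an illustration and not the plugin's actual implementation: the helper names (chunked, build_insert), the table name "lookup" and the column names are all hypothetical.

```python
from typing import Iterator, List, Sequence, Tuple

Row = Tuple[object, ...]


def chunked(rows: Sequence[Row], size: int) -> Iterator[Sequence[Row]]:
    """Yield consecutive slices of `rows`, each holding at most `size` rows."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def build_insert(table: str, columns: Sequence[str],
                 batch: Sequence[Row]) -> Tuple[str, List[object]]:
    """Build one parameterised multi-row INSERT for a single batch.

    Returns the SQL text and the flattened parameter list.
    """
    row_placeholder = "(" + ", ".join("?" for _ in columns) + ")"
    sql = "INSERT INTO {} ({}) VALUES {}".format(
        table,
        ", ".join(columns),
        ", ".join(row_placeholder for _ in batch),
    )
    params = [value for row in batch for value in row]
    return sql, params


# Hypothetical data: 70,000 rows of key + 3 values, as in the report above.
rows = [(i, "a", "b", "c") for i in range(70000)]

# Page size of 5000 taken from the suggestion above.
statements = [
    build_insert("lookup", ["id", "v1", "v2", "v3"], batch)
    for batch in chunked(rows, 5000)
]

print(len(statements))  # 14 statements instead of one 70,000-row INSERT
```

Each resulting statement stays well under the ~7,000 rows that were reported to load without a StackOverflowError; the exact safe limit is an assumption and may depend on the JVM stack size.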

Thanks for reporting this.


Hi, thanks for the reply. I'll check the plugin's changelog occasionally, and if there is an update, I'll try it again. Thank you and have a nice day!

Hi Radim - The PR is up for review https://github.com/logstash-plugins/logstash-filter-jdbc_static/pull/19
