Redis connect to local socket file in Ruby filter?

Hey folks,

I am trying to perform some key-value pair lookups using Redis in a Ruby filter in Logstash 6.0.0. I have the lookup working (filter shown below):

filter {
  ruby {
    init => "require 'redis'; require 'time'"
    code => 'start_time = Time.now
             rc = Redis.new(host: "127.0.0.1", port: 6379, db: 3)
             event.set("redis_val", rc.hget("redis_test", "first_val"))
             end_time = Time.now
             event.set("elapsed", end_time - start_time)'
  }
}

This works fine. I'm capturing the time required for the processing of the filter in the "elapsed" field, which is consistently around 0.001 to 0.002 seconds. However, now having tasted the sweet taste of Redis connectivity, I want it to be faster!
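For timings this small, a monotonic clock may give steadier numbers than Time.now, since wall-clock time can jump if the system clock is adjusted. The sketch below is only an illustration: the `timed` helper is a hypothetical name, and the block passed to it is a stand-in for the real Redis lookup.

```ruby
# Hypothetical helper: runs the block and returns [result, elapsed_seconds].
# Uses the monotonic clock, which is not affected by wall-clock adjustments.
def timed
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  [result, Process.clock_gettime(Process::CLOCK_MONOTONIC) - started]
end

# Stand-in for the real lookup, e.g. rc.hget("redis_test", "first_val").
value, elapsed = timed { "stand-in value" }
```

Inside the filter, the same two clock reads could replace the Time.now calls, with the difference written to the "elapsed" field.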

So, I was wondering whether the Redis Ruby library can connect to a local UNIX socket file on the host machine's file system, or whether the Ruby filter is sandboxed in some fashion. I'd prefer to connect to the socket file to avoid the round trip through the network stack. Loopback connections are very fast, but a UNIX domain socket skips the TCP/IP layers entirely.

I suspect the overhead of setting up a connection for each event will overshadow any benefit of using a Unix socket, but to answer your question, no, there's no sandbox here.

I suspect you're right. I don't know of any way to create a persistent Redis connection, though. I'd love to be able to do that. Speed is what this game is all about. Thanks!

As it turns out, the local socket connection is faster, enough so that it's worth doing it that way as opposed to over the network. It's consistently below 1 msec now, usually around 0.0007 seconds. Here's the redesigned filter, which doesn't raise a Ruby exception if the Redis server can't be found.

filter {
  # Test filter that just uses the Ruby filter to add a field to every record
  ruby {
    init => "require 'redis'; require 'time'"
    code => 'start_time = Time.now
             begin
               #rc = Redis.new(host: "127.0.0.1", port: 6379, db: 3)
               # port is not needed when connecting through a UNIX socket path
               rc = Redis.new(path: "/tmp/redis.sock", db: 3)
               event.set("redis_status", "OK")
               event.set("redis_val", rc.hget("redis_test", "first_val"))
             rescue
               event.set("redis_status", "Cannot connect")
             end
             end_time = Time.now
             event.set("redis_elapsed", end_time - start_time)'
  }
}

It would be useful to be able to establish a persistent connection to the Redis instance to speed this up any further. Any chance of that? I imagine such a thing would be great for JDBC database connections as well.

It's also a good idea to specify a short connect_timeout value as a Redis.new parameter. I set mine to 0.0005 seconds, so the rescue block is invoked sooner if I can't establish a connection. Otherwise the time spent in the filter is much longer, slowing down the whole pipeline. Better to fail fast.
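As a rough sketch of that fail-fast idea: the options hash below is what one might pass to Redis.new (the redis gem's timeout option is assumed here to be connect_timeout, in seconds), and `safe_lookup` is a hypothetical helper that turns any error into a status field instead of letting it propagate. The actual Redis call is only shown in a comment so the snippet runs without a Redis server.

```ruby
# Options one might pass to Redis.new; connect_timeout is in seconds.
# The socket path and db number are taken from the filter above.
REDIS_OPTIONS = { path: "/tmp/redis.sock", db: 3, connect_timeout: 0.0005 }.freeze

# Hypothetical helper: runs the lookup block and reports a status
# instead of raising, mirroring the begin/rescue in the filter.
def safe_lookup
  value = yield
  { "redis_status" => "OK", "redis_val" => value }
rescue StandardError => e
  { "redis_status" => "Cannot connect", "redis_error" => e.class.name }
end

# Intended use (requires the redis gem and a running server):
#   safe_lookup { Redis.new(**REDIS_OPTIONS).hget("redis_test", "first_val") }
```

Recording the exception class alongside the status makes it easier to tell a missing socket file from a timeout when reading the events later.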

> It would be useful to be able to establish a persistent connection to the Redis instance to speed this up any further. Any chance of that?

Perhaps you can do that in the init section of the ruby filter? Otherwise you'd probably have to write a custom plugin.

> I imagine such a thing would be great for JDBC database connections as well.

The jdbc_streaming filter should take care of that.

I thought the init section was simply to define the requirements for the plugin to ensure they loaded first. Does the init section load only once when Logstash starts? If so, do the variables defined in it have global scope to all Ruby filters? That could do the job....

I believe the init section is run when that particular ruby filter instance is initialized.

Very good. I will give it a try and let you know if it works, and if so, what effect (if any) it has on the performance of the filter. Thanks for the idea!

This worked, but I had to explicitly define the Redis connection variable in global scope with the $ prefix. See below. Many thanks for the suggestion, Magnus!

filter {
  # Test filter that just uses the Ruby filter to add a field to every record
  ruby {
    init => 'require "redis";
             require "time";
             $rc = Redis.new(path: "/tmp/redis.sock", db: 3)'
    code => 'start_time = Time.now
             begin
               event.set("redis_status", "OK")
               event.set("redis_val", $rc.hget("redis_test", "first_val"))
             rescue
               event.set("redis_status", "Cannot connect")
             end
             end_time = Time.now
             event.set("redis_elapsed", end_time - start_time)'
  }
}

One more note: it appears that globals defined in a Ruby filter's init block are TRULY global. I created another filter similar to the one described above, except without the Redis.new declaration in the init block, and I was still able to access $rc from that filter.

So the caveat here is that you must make sure you define distinct global variables across all Ruby filters which use this technique for sharing connections.
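The scoping behaviour can be reproduced in plain Ruby. This is only a sketch under the assumption that the filter evaluates its init and code strings separately; `FakeFilter` is a hypothetical stand-in, not Logstash's actual implementation. A local variable set in one evaluated string is gone by the time the next string runs, while a `$` global is visible everywhere, including from a second, unrelated filter.

```ruby
# Hypothetical stand-in for a filter that evaluates two config strings separately.
class FakeFilter
  def initialize(init_src, code_src)
    @code_src = code_src
    instance_eval(init_src)    # runs once, like an init block
  end

  def call
    instance_eval(@code_src)   # runs per event, with a fresh local scope
  end
end

# A local variable assigned in init is not visible to the code string.
local_filter = FakeFilter.new('conn = "local handle"', 'conn')
local_result = begin
  local_filter.call
rescue NameError
  :name_error
end

# A global assigned in init is visible to the code string...
global_filter = FakeFilter.new('$conn = "global handle"', '$conn')
global_result = global_filter.call

# ...and also to a different filter that never defined it.
other_filter = FakeFilter.new('', '$conn')
shared_result = other_filter.call
```

Because the global leaks into every filter, giving each connection its own clearly named global (for example one per Redis database) avoids one filter silently overwriting another's connection.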

Also, the time savings from using a single shared Redis connection are substantial. The elapsed time is now occasionally as low as 0.049 msec, though it hovers around 0.07 msec most of the time.

I call that a huge win.
