Pipeline Performance & jdbc_static filter

hermann102 · November 17, 2018, 9:54am

Hello everyone,

I am new to ELK and have a number of issues that I would like to share with you.

My difficulties concern the following points:

Pipeline performance (Filebeat & JDBC --> Logstash --> Elasticsearch --> Kibana)
Setting up the nodes (master, data, ingest)
Using the jdbc_static filter plugin
The loading speed of Kibana visualizations when the index holds a lot of documents

I. Pipeline Performance (Filebeat & JDBC --> Logstash --> Elasticsearch --> Kibana)

Context

A. Pipeline Architecture

Our current pipeline is built on the following architecture: Filebeat & JDBC --> Logstash --> Elasticsearch --> Kibana

We have 4 JDBC connectors at the Logstash level that retrieve data from different tables and store it in different indexes.

B. System

Logstash, Elasticsearch and Kibana run on: OS Ubuntu, RAM 8 GB, CPU dual core

Filebeat runs on: OS Windows, RAM 28 GB, CPU 2.3 GHz x 2

C. Volume

Among our current JDBC connectors, one alone collects about 100 million records on average in a single run.

We estimate the overall volume on this pipeline at about 300 million records per month on average.

D. Nodes

All this traffic is currently handled by a single node (default configuration).

E. Issue

The problem we are currently facing is performance. Loading 160 million records through one JDBC connector took us 4 days (96 hours), which is far too long for us.

We would like to load this data in 24 hours maximum.
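
To put numbers on this, here is a minimal sketch of the average ingest rates implied by the figures above. It assumes the load is spread evenly over the whole period, and the helper name `docs_per_second` is only illustrative.

```python
# Back-of-the-envelope ingest rates from the figures quoted in this post.
# Assumption: records are loaded at a constant rate over the whole period.
SECONDS_PER_HOUR = 3600


def docs_per_second(total_docs: int, hours: float) -> float:
    """Average rate needed to load total_docs within the given number of hours."""
    return total_docs / (hours * SECONDS_PER_HOUR)


current_rate = docs_per_second(160_000_000, 96)  # observed: 4 days
target_rate = docs_per_second(160_000_000, 24)   # goal: 24 hours

print(f"current: {current_rate:,.0f} records/s")
print(f"target:  {target_rate:,.0f} records/s")
print(f"speed-up needed: {target_rate / current_rate:.0f}x")
```

So the current load averages roughly 463 records per second, and the 24-hour goal needs roughly 1,852 records per second, a 4x improvement.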

F. My questions

What is the best sizing (A, B, D) to reach our goal of 160 million records in 24 hours?

II. Setting the nodes

1. Is it possible to build a pipeline with several nodes (master, data, ingest) on a single virtual machine?
2. Can you guide me in building a robust pipeline?

III. Using the jdbc_static plugin

So far we use one jdbc input plugin per SQL table in the Logstash input block.

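input {
    # jdbc connector 1 for table 1 (connection settings and statement omitted)
    jdbc {
        type => "type1"
    }

    # jdbc connector 2 for table 2 (connection settings and statement omitted)
    jdbc {
        type => "type2"
    }
}
output {
    if [type] == "type1" {
        # index 1 for table 1
        elasticsearch { index => "index1" }
    }

    if [type] == "type2" {
        # index 2 for table 2
        elasticsearch { index => "index2" }
    }
}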

But according to the documentation, it is possible to use jdbc_static at the filter level to connect to several tables of the same database.

filter {
    jdbc_static {
        # Load data from the remote database
        loaders => [
            {
                query => "load data from remote table 1"
                local_table => "save to local table 1"
            },
            {
                query => "load data from remote table 2"
                local_table => "save to local table 2"
            }
        ]

        # Define the local tables that receive the loaded data
        local_db_objects => [
            {
                name => "set local table 1"
            },
            {
                name => "set local table 2"
            }
        ]

        # Define the lookups against the local tables
        local_lookups => [
            {
                query => "loop local table 1 put on field"
            },
            {
                query => "loop local table 2 put on field"
            }
        ]

        staging_directory => "****"
        loader_schedule => "* * * * *"
        jdbc_user => "logstash"
        jdbc_password => "example"
        jdbc_driver_class => "*****"
        jdbc_driver_library => "****"
        jdbc_connection_string => "****"
    }
}

My questions

Can we store the data from local table 1 and local table 2 in a single index?
How do we set the type keyword so that the output block can route the data to an index?
How do we send each local table to its own index?

output {
    if [type] {
        # send the local table data to one index
    }
}

IV. Slow loading of Kibana visualizations

When we have a lot of data in the index, Kibana takes too long to load the visualizations. It even looks as if the visualizations do not load all the data.
How can we solve this problem?

Christian_Dahlqvist · November 17, 2018, 11:37am

What type of hardware is this running on? How many hosts? What does the load on the host look like, e.g. CPU and disk I/O and iowait?

hermann102 · November 17, 2018, 12:09pm

Let me check the exact information with the system administrator and come back to you.
I already gave some system information in my post (B. System). As for hosts, we have only one.

hermann102 · November 19, 2018, 11:53am

Hello Christian Dahlqvist

Sorry for the late reply. Regarding your question, here is the information I have.
Thanks for your help.

Size: Standard_D2s_v3
Virtual processors: 2
Memory: 8 GB
Temporary storage (SSD): 16 GiB
Max data disks: 4
Max cached and temporary storage throughput, IOPS / MBps (cache size in GiB): 4,000 / 32 (50)
Max uncached disk throughput, IOPS / MBps: 3,200 / 48
Max NICs / expected network bandwidth (Mbps): 2 / 1,000

Christian_Dahlqvist · November 19, 2018, 12:10pm

What do CPU usage and disk performance look like? Do you have any monitoring installed?

hermann102 · November 19, 2018, 12:14pm

No, I do not have any monitoring.

Christian_Dahlqvist · November 19, 2018, 12:37pm

It is a very small host given you are running all the components on it, so I would suspect you may not have sufficient resources. Log into the host and use the top and iostat tools to give you an idea what is going on (assuming you are using Linux).

system · December 17, 2018, 12:46pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
