Logstash Pipeline for Microsoft SQL Server to Elasticsearch

I have a Microsoft SQL Server whose data I wish to (continuously) push into Elasticsearch.

In my base directory I have a docker-compose.yml file:

version: "3.7"

services:
        backend:
                build: ./flask
                image: backend
                container_name: backend
                restart: always
                volumes:
                        - ./flask:/usr/src/app
                ports:
                        - 5000:5000
                command: python manage.py run -h 0.0.0.0
                env_file:
                        - .env
                depends_on:
                        - mssql
                networks:
                        - microsoftSQL
        mssql:
                image: mcr.microsoft.com/mssql/server:2019-GA-ubuntu-16.04
                container_name: sql1
                restart: always
                ports:
                        - 1433:1433
                environment:
                        - ACCEPT_EULA=Y
                        - SA_PASSWORD=${SQL_PASSWORD}
                networks:
                        - microsoftSQL
                
        elasticsearch:
                image: elasticsearch:7.5.1
                container_name: elasticsearch
                environment:
                        - cluster.name=docker-cluster
                        - bootstrap.memory_lock=true
                        - discovery.type=single-node
                        - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
                ulimits:
                        memlock:
                                soft: -1
                                hard: -1
                volumes:
                        - esdata1:/usr/share/elasticsearch/data
                restart: always
                ports:
                        - 9200:9200
                networks:
                        - microsoftSQL
               
        logstash:
                build: ./logstash
                environment:
                        - LS_JAVA_OPTS=-Xmx256m -Xms256m
                        - DEBUG=1
                env_file:
                        - .env
                ports:
                        - '5001:5001'
                container_name: logstash
                restart: always
                networks:
                        - microsoftSQL
                links:
                        - elasticsearch
                depends_on:
                        - elasticsearch

        kibana:
                image: docker.elastic.co/kibana/kibana:7.5.1
                container_name: kibana
                restart: always
                ports:
                        - 5601:5601
                environment:
                        - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
                networks:
                        - microsoftSQL
                links:
                        - elasticsearch

networks:
        microsoftSQL:
                driver: bridge

volumes:
        db-data:
        esdata1:
                driver: local

The relevant files in my directory are:

logstash/
├── Dockerfile
└── config
    └── logstash.conf

where the Dockerfile is:
FROM docker.elastic.co/logstash/logstash:7.5.1

RUN rm -f /usr/share/logstash/pipeline/logstash.conf

ADD config/ /usr/share/logstash/config

USER root 

#CMD bin/logstash-plugin install logstash-input-jdbc
#COPY mssql-jdbc-*.jre9.jar /opt/

# Add logstash plugins setup here:
RUN logstash-plugin install logstash-input-jdbc

and logstash.conf:

input {
  jdbc {
    jdbc_driver_library => "/opt/mssql-jdbc-7.5.1.jre9.jar"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://sql1:1433.database.windows.net"
    jdbc_user => "******"
    jdbc_password => "******"
    statement => "SELECT * from TestDB.Inventory;"
    type => "db-logs-access"
  }
}

I think I should expect this setup to index the MS SQL data into Elasticsearch, but what I actually see is Logstash terminating:

[2019-12-20T10:08:40,557][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::Elasticsearch", :hosts=>["http://elasticsearch:9200"]}
[2019-12-20T10:08:40,686][INFO ][logstash.javapipeline ] Starting pipeline {:pipeline_id=>".monitoring-logstash", "pipeline.workers"=>1, "pipeline.batch.size"=>2, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>2, "pipeline.sources"=>["monitoring pipeline"], :thread=>"#<Thread:0x63c4280a run>"}
[2019-12-20T10:08:40,737][INFO ][logstash.javapipeline ] Pipeline started {"pipeline.id"=>".monitoring-logstash"}
[2019-12-20T10:08:40,831][INFO ][logstash.agent ] Pipelines running {:count=>1, :running_pipelines=>[:".monitoring-logstash"], :non_running_pipelines=>[]}
[2019-12-20T10:08:41,171][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
[2019-12-20T10:08:41,926][INFO ][logstash.javapipeline ] Pipeline terminated {"pipeline.id"=>".monitoring-logstash"}
[2019-12-20T10:08:42,630][INFO ][logstash.runner ] Logstash shut down.

So the pipeline appears to shut down pretty quickly after starting. Have I accidentally overwritten the default pipelines?
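
One way to check which pipeline definition the container actually picks up (paths below are the stock Logstash image defaults, so this is just a sketch):

# Show where the image expects pipeline definitions
docker exec logstash cat /usr/share/logstash/config/pipelines.yml

# List what actually ended up in the default pipeline directory
docker exec logstash ls -la /usr/share/logstash/pipeline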

Hi

I'll take a shot at it.

You don't have a scheduler in your jdbc input, nor a filter{}, nor an output{}.

My guess is that Logstash is executing your input{} exactly once, without error, and then finishing, since it has nothing else to do.

You need, at least, an output{} to send your data to Elasticsearch, and probably a filter{} to prepare it for that output. And then a scheduler to keep it running at the rate you wish.
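
Something along these lines, as a minimal sketch (the cron expression and index name are placeholders for illustration):

input {
  jdbc {
    # ... your jdbc settings as above ...
    schedule => "* * * * *"   # re-run the statement every minute
  }
}

output {
  elasticsearch {
    hosts => ["elasticsearch:9200"]
    index => "my-index"
  }
  stdout { codec => rubydebug }   # also print each event, for debugging
}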

Hope this helps

Hi Jordi,

Yes, adding an output makes sense, thanks! I think I just blindly copied that file off of another post without properly looking into what it's there for, my mistake. Having now read your post and also looked at the documentation for that config file, my understanding is that the following should be sufficient to see some output in Elasticsearch:

input {
  jdbc {
    jdbc_driver_library => "/opt/mssql-jdbc-7.5.1.jre9.jar"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://sql1:1433.database.windows.net"
    jdbc_user => "******"
    jdbc_password => "******"
    statement => "SELECT * from Inventory;"
    type => "db-logs-access"
  }
}

output {
  elasticsearch {
    hosts => ["elasticsearch:9200"]
    index => "fruits"
  }
  stdout {}
}

My error doesn't look any different, however, so I think perhaps the issue lies in one of the following places:

jdbc_connection_string

Format:
# "jdbc:sqlserver://HostName\instanceName;database=DBName;user=UserName;password=Password"

Tried:
jdbc_connection_string => "jdbc:sqlserver://sql1:1433.database.windows.net"

Also tried:
jdbc_connection_string => "jdbc:sqlserver://sql1\32446f86432d:1433.database.windows.net"

statement

Tried:
statement => "SELECT * from Inventory;"

Also tried:
statement => "SELECT * from TestDB.Inventory;"

I have the following for the SQL Server connection and it is working:

jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
jdbc_connection_string => "jdbc:sqlserver://db_host_name;databaseName=database_name;user=logstash;password=logstash_password_on_sqlserver;"
jdbc_user => "logstash"
jdbc_password => "logstash_password_on_sqlserver"
statement => "select * from database_name.full_table_name"
clean_run => true

and I have this file:
ls -la /usr/share/logstash/logstash-core/lib/jars/mssql-jdbc-7.4.1.jre8.jar
-rw-r--r-- 1 logstash logstash 1209659 Sep 17 13:06 /usr/share/logstash/logstash-core/lib/jars/mssql-jdbc-7.4.1.jre8.jar

Hi, so I went away and tried this. My directory structure is now:

logstash/
├── Dockerfile
├── config
│   └── logstash.conf
└── mssql-jdbc-7.4.1.jre8.jar

where mssql-jdbc-7.4.1.jre8.jar was manually downloaded and dropped into the folder.

My Dockerfile:

    FROM docker.elastic.co/logstash/logstash:7.5.1

    ADD config/ /usr/share/logstash/config

    USER root

    COPY mssql-jdbc-7.4.1.jre8.jar /usr/share/logstash/logstash-core/lib/jars/mssql-jdbc-7.4.1.jre8.jar

    # Add logstash plugins setup here:
    RUN logstash-plugin install logstash-input-jdbc

My logstash.conf:

    input {
      jdbc {
        jdbc_driver_library => "/opt/mssql-jdbc-7.5.1.jre9.jar"
        jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
        jdbc_connection_string => "jdbc:sqlserver://sql1:1433;databaseName=TestDB;user=******;password=******;"
        jdbc_user => "******"
        jdbc_password => "******"
        statement => "SELECT * from TestDB.Inventory;"
        type => "db-logs-access"
      }
    }

    output {
      elasticsearch {
        hosts => ["elasticsearch:9200"]
        index => "fruits"
      }
      stdout {}
    }

Now, I don't get anything into ES from this. Do you know how your setup differs from mine?
If I run from the terminal:

docker exec -it sql1 "bash"
/opt/mssql-tools/bin/sqlcmd -S localhost -U *******

the mssql bash opens up and running

USE TestDB;
GO

SELECT * FROM Inventory;
GO

yields the data I would expect to end up in Elasticsearch.
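
Equivalently, as a one-shot check from the host (assuming the sa login and the default dbo schema):

# Prompts for the sa password, runs the query, and exits
docker exec -it sql1 /opt/mssql-tools/bin/sqlcmd -S localhost -U sa -Q "SELECT * FROM TestDB.dbo.Inventory"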

Copy your JDBC jar to this dir: "/usr/share/logstash/logstash-core/lib/jars/"

It is there; my Dockerfile copies it across.

running

docker exec logstash ls /usr/share/logstash/logstash-core/lib/jars

shows that the file (mssql-jdbc-7.4.1.jre8.jar) is in that directory.

That's correct isn't it?

Yes, that file in that dir did the trick for me.

I do not know what else could be wrong. Maybe someone else knows.

I am still having this issue, if anyone else has any suggestions?

Thanks
Jacob

Hi

I think you should remove your jdbc_driver_library line from your logstash.conf, since the driver is now installed on its default path.

In the version you posted above, it points to /opt/mssql-jdbc-7.5.1.jre9.jar, while you copied mssql-jdbc-7.4.1.jre8.jar to /usr/share/logstash/logstash-core/lib/jars.
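
As a sketch, the input block could then look like this (credentials masked as in your post; the schedule line is the optional addition suggested earlier):

input {
  jdbc {
    # no jdbc_driver_library line: a jar placed in
    # /usr/share/logstash/logstash-core/lib/jars is picked up automatically
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://sql1:1433;databaseName=TestDB;"
    jdbc_user => "******"
    jdbc_password => "******"
    statement => "SELECT * from Inventory;"
    schedule => "* * * * *"
    type => "db-logs-access"
  }
}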

Hope this helps.
