Index: MySQL to Elasticsearch 6.2 with Logstash 6.2

Hi,

I'm trying to use Logstash 6.2 and mysql-connector-java-8.0.11 to import rows from a MySQL server into Elasticsearch 6.2. Everything seems to work except for the creation of the index in Elasticsearch. I thought it was created automatically, but apparently it isn't.

How can I create an index that matches the MySQL schema (even though the two architectures are completely different)?
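For example, should I create it by hand with an explicit mapping, something like this (column names are made up, credentials as in my config)?

curl -u user:pass -XPUT 'localhost:9200/indexer' -H 'Content-Type: application/json' -d '
{
  "mappings" : {
    "typer" : {
      "properties" : {
        "id"   : { "type" : "integer" },
        "name" : { "type" : "text" }
      }
    }
  }
}'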

Here is my Logstash config:
input {
  jdbc {
    jdbc_connection_string => "jdbc:mysql://address:3306/db"
    jdbc_user => "user"
    jdbc_password => "pass"
    jdbc_validate_connection => true
    jdbc_driver_library => "./../mysql-connector-java-8.0.11/mysql-connector-java-8.0.11-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_fetch_size => 100
    schedule => "* * * * *"
    statement => "SELECT * FROM table"
  }
}
output {
  elasticsearch {
    index => "indexer"
    document_type => "typer"
    document_id => "%{id}"
    hosts => ["localhost:9200"]
    user => "user"
    password => "pass"
    template_overwrite => true
    template => "/etc/logstash/defaultemplate.json"
  }
}

And the template can be found here: https://github.com/logstash-plugins/logstash-output-elasticsearch/blob/master/lib/logstash/outputs/elasticsearch/elasticsearch-template-es6x.json

Thanks for reading, and for any help you can offer :slight_smile:

Elasticsearch, by default, will automatically create the index when the first document is indexed.

Why do you think that the index indexer was not created?
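One thing worth checking: if index auto-creation was disabled on the cluster (the action.auto_create_index setting), nothing will be created. If that turns out to be the case, something like this would re-enable it (credentials assumed from your setup):

curl -u user:pass -XPUT 'localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d '
{
  "persistent" : { "action.auto_create_index" : "true" }
}'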

Thank you for your reply. :slight_smile:
When I run

curl localhost:9200/_cat/indices

I don't see the index listed.
So I created it manually, but it never fills up.

I'm using X-Pack, in case that changes anything.

The index should have been created automatically when the first document was indexed.

That said, if you created it manually and it still doesn't fill up, I suspect you are hitting indexing errors in Logstash.

Please attach the Logstash logs, along with the content of "/etc/logstash/defaultemplate.json".
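On a package install, the Logstash logs usually end up under /var/log/logstash/, e.g.:

tail -n 200 /var/log/logstash/logstash-plain.log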

Here are my Logstash logs:

[2018-05-13T19:24:06,168][INFO ][logstash.runner ] Starting Logstash {"logstash.version"=>"6.2.4"}
[2018-05-13T19:24:06,279][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
[2018-05-13T19:24:06,667][WARN ][logstash.outputs.elasticsearch] You are using a deprecated config setting "document_type" set in elasticsearch. Deprecated settings will continue to work, but are scheduled for removal from logstash in the future. Document types are being deprecated in Elasticsearch 6.0, and removed entirely in 7.0. You should avoid this feature If you have any questions about this, please visit the logstash channel on freenode irc. {:name=>"document_type", :plugin=><LogStash::Outputs::Elasticsearch hosts=>[http://127.0.0.1:9200], bulk_path=>"/_xpack/monitoring/_bulk?system_id=logstash&system_api_version=2&interval=1s", manage_template=>false, document_type=>"%{[@metadata][document_type]}", sniffing=>false, user=>"user", password=>, id=>"5945ca2417ad319868d778c7400d214bb01ea322355cce4f402badcb5090c9df", enable_metric=>true, codec=><LogStash::Codecs::Plain id=>"plain_c816780e-8635-47c6-8ad3-553370d2452c", enable_metric=>true, charset=>"UTF-8">, workers=>1, template_name=>"logstash", template_overwrite=>false, doc_as_upsert=>false, script_type=>"inline", script_lang=>"painless", script_var_name=>"event", scripted_upsert=>false, retry_initial_interval=>2, retry_max_interval=>64, retry_on_conflict=>1, action=>"index", ssl_certificate_verification=>true, sniffing_delay=>5, timeout=>60, pool_max=>1000, pool_max_per_route=>100, resurrect_delay=>5, validate_after_inactivity=>10000, http_compression=>false>}
[2018-05-13T19:24:06,675][INFO ][logstash.pipeline ] Starting pipeline {:pipeline_id=>".monitoring-logstash", "pipeline.workers"=>1, "pipeline.batch.size"=>2, "pipeline.batch.delay"=>50}
[2018-05-13T19:24:06,922][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>, :added=>[http://user:pass@127.0.0.1:9200/]}}
[2018-05-13T19:24:06,926][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://user:pass@127.0.0.1:9200/, :path=>"/"}
[2018-05-13T19:24:07,107][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>"http://user:pass@127.0.0.1:9200/"}
[2018-05-13T19:24:07,146][INFO ][logstash.outputs.elasticsearch] ES Output version determined {:es_version=>6}
[2018-05-13T19:24:07,146][WARN ][logstash.outputs.elasticsearch] Detected a 6.x and above cluster: the type event field won't be used to determine the document _type {:es_version=>6}
[2018-05-13T19:24:07,148][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::Elasticsearch", :hosts=>["http://127.0.0.1:9200"]}
[2018-05-13T19:24:07,209][INFO ][logstash.licensechecker.licensereader] Elasticsearch pool URLs updated {:changes=>{:removed=>, :added=>[http://user:pass@127.0.0.1:9200/]}}
[2018-05-13T19:24:07,209][INFO ][logstash.licensechecker.licensereader] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://user:pass@127.0.0.1:9200/, :path=>"/"}
[2018-05-13T19:24:07,213][WARN ][logstash.licensechecker.licensereader] Restored connection to ES instance {:url=>"http://user:pass@127.0.0.1:9200/"}
[2018-05-13T19:24:07,218][INFO ][logstash.licensechecker.licensereader] ES Output version determined {:es_version=>6}
[2018-05-13T19:24:07,219][WARN ][logstash.licensechecker.licensereader] Detected a 6.x and above cluster: the type event field won't be used to determine the document _type {:es_version=>6}
[2018-05-13T19:24:07,248][INFO ][logstash.pipeline ] Pipeline started successfully {:pipeline_id=>".monitoring-logstash", :thread=>"#<Thread:0x518f2678@/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:247 sleep>"}
[2018-05-13T19:24:07,966][WARN ][logstash.outputs.elasticsearch] You are using a deprecated config setting "document_type" set in elasticsearch. Deprecated settings will continue to work, but are scheduled for removal from logstash in the future. Document types are being deprecated in Elasticsearch 6.0, and removed entirely in 7.0. You should avoid this feature If you have any questions about this, please visit the logstash channel on freenode irc. {:name=>"document_type", :plugin=><LogStash::Outputs::Elasticsearch hosts=>[//localhost:9200], user=>"user", password=>, sniffing=>true, manage_template=>false, index=>"%{[@metadata][beat]}-%{+YYYY.MM.dd}", document_type=>"%{[@metadata][type]}", id=>"86c7d5c366bfbb8a3698157dcf310bc815f5323314055bddc92f3e80ddcc5300", enable_metric=>true, codec=><LogStash::Codecs::Plain id=>"plain_d8128abc-fdc3-440a-97e6-b77b785841bf", enable_metric=>true, charset=>"UTF-8">, workers=>1, template_name=>"logstash", template_overwrite=>false, doc_as_upsert=>false, script_type=>"inline", script_lang=>"painless", script_var_name=>"event", scripted_upsert=>false, retry_initial_interval=>2, retry_max_interval=>64, retry_on_conflict=>1, action=>"index", ssl_certificate_verification=>true, sniffing_delay=>5, timeout=>60, pool_max=>1000, pool_max_per_route=>100, resurrect_delay=>5, validate_after_inactivity=>10000, http_compression=>false>}
[2018-05-13T19:24:08,006][INFO ][logstash.pipeline ] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50}
[2018-05-13T19:24:08,018][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>, :added=>[http://user:pass@localhost:9200/]}}
[2018-05-13T19:24:08,020][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://user:pass@localhost:9200/, :path=>"/"}
[2018-05-13T19:24:08,029][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>"http://user:pass@localhost:9200/"}
[2018-05-13T19:24:08,037][INFO ][logstash.outputs.elasticsearch] ES Output version determined {:es_version=>6}
[2018-05-13T19:24:08,037][WARN ][logstash.outputs.elasticsearch] Detected a 6.x and above cluster: the type event field won't be used to determine the document _type {:es_version=>6}

Since I can't post more than 7000 characters, I split my logs :wink:

[2018-05-13T19:24:08,039][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::Elasticsearch", :hosts=>["//localhost:9200"]}
[2018-05-13T19:24:08,046][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>, :added=>[http://user:pass@localhost:9200/]}}
[2018-05-13T19:24:08,047][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://user:pass@localhost:9200/, :path=>"/"}
[2018-05-13T19:24:08,053][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>"http://user:pass@localhost:9200/"}
[2018-05-13T19:24:08,059][INFO ][logstash.outputs.elasticsearch] ES Output version determined {:es_version=>6}
[2018-05-13T19:24:08,060][WARN ][logstash.outputs.elasticsearch] Detected a 6.x and above cluster: the type event field won't be used to determine the document _type {:es_version=>6}
[2018-05-13T19:24:08,060][INFO ][logstash.outputs.elasticsearch] Using mapping template from {:path=>nil}
[2018-05-13T19:24:08,064][INFO ][logstash.outputs.elasticsearch] Attempting to install template {:manage_template=>{"template"=>"logstash-", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"default"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}}
[2018-05-13T19:24:08,072][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::Elasticsearch", :hosts=>["//localhost:9200"]}
[2018-05-13T19:24:08,484][INFO ][logstash.inputs.beats ] Beats inputs: Starting input listener {:address=>"0.0.0.0:5044"}
[2018-05-13T19:24:08,675][INFO ][logstash.pipeline ] Pipeline started successfully {:pipeline_id=>"main", :thread=>"#<Thread:0x55887bb9@/usr/share/logstash/logstash-core/lib/logstash/pipeline.rb:247 sleep>"}
[2018-05-13T19:24:08,678][INFO ][org.logstash.beats.Server] Starting server on port: 5044
[2018-05-13T19:24:08,688][INFO ][logstash.agent ] Pipelines running {:count=>2, :pipelines=>[".monitoring-logstash", "main"]}
[2018-05-13T19:24:08,689][INFO ][logstash.inputs.metrics ] Monitoring License OK
[2018-05-13T19:24:13,051][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>[http://user:pass@localhost:9200/], :added=>[http://user:pass@127.0.0.1:9200/]}}
[2018-05-13T19:24:13,053][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://user:pass@127.0.0.1:9200/, :path=>"/"}
[2018-05-13T19:24:13,057][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>"http://user:pass@127.0.0.1:9200/"}

And here is the /etc/logstash/defaultemplate.json, based on the template linked above:

{
  "template" : "places",
  "version" : 60001,
  "settings" : {
    "index.refresh_interval" : "5s"
  },
  "mappings" : {
    "_default_" : {
      "dynamic_templates" : [ {
        "message_field" : {
          "path_match" : "message",
          "match_mapping_type" : "string",
          "mapping" : {
            "type" : "text",
            "norms" : false
          }
        }
      }, {
        "string_fields" : {
          "match" : "*",
          "match_mapping_type" : "string",
          "mapping" : {
            "type" : "text",
            "norms" : false,
            "fields" : {
              "keyword" : { "type" : "keyword", "ignore_above" : 256 }
            }
          }
        }
      } ],
      "properties" : {
        "@timestamp" : { "type" : "date" },
        "@version" : { "type" : "keyword" },
        "geoip" : {
          "dynamic" : true,
          "properties" : {
            "ip" : { "type" : "ip" },
            "location" : { "type" : "geo_point" },
            "latitude" : { "type" : "half_float" },
            "longitude" : { "type" : "half_float" }
          }
        }
      }
    }
  }
}

But even without this template, the index never fills up.

Thank you for your time :slight_smile:

Your log shows this:

[2018-05-13T19:24:08,484][INFO ][logstash.inputs.beats ] Beats inputs: Starting input listener {:address=>"0.0.0.0:5044"}

This means Logstash is starting a Beats input, while your configuration only uses JDBC. Also, the logs do not show the JDBC input starting up. So I would say that the configuration you posted here is not the one Logstash is actually using.
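For context, on a default package install Logstash loads every *.conf file under /etc/logstash/conf.d into the single main pipeline, per the default /etc/logstash/pipelines.yml:

- pipeline.id: main
  path.config: "/etc/logstash/conf.d/*.conf"

So if a Beats config file sits next to your JDBC one, both end up merged into the same pipeline.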

Additionally, this is not directly related, but it is worth mentioning that this index template will not be applied to the indexer index, since its template attribute is set to places. If you want to use that template for the indexer index, you need to set "template": "indexer".
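As a quick sanity check (credentials as in your output config), you can list the installed templates and the index patterns they match:

curl -u user:pass 'localhost:9200/_template?pretty'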

That is because I changed the name of the index. I also have some other config files, and those work.
My problem only occurs with the JDBC config file.

Ok, so run Logstash with --log.level debug, let it run for a few seconds (to produce more logs), upload the content to a gist (or pastebin) and post the link here.
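For example (paths assumed from a standard package install):

/usr/share/logstash/bin/logstash --path.settings /etc/logstash --log.level=debug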

I usually launch Logstash as a service, but I tried this command:

sudo /usr/share/logstash/bin/logstash -f --log.level=debug --path.settings /etc/logstash/conf.d/places.conf

which gave me this error:

ERROR: Failed to load settings file from "path.settings". Aborting... path.setting=/etc/logstash/conf.d/places.conf, exception=Java::JavaLang::RuntimeException, message=>Unhandled IOException: java.io.IOException: unhandled errno: Not a directory
[ERROR] 2018-05-14 11:41:59.673 [main] Logstash - java.lang.IllegalStateException: org.jruby.exceptions.RaiseException: (SystemExit) exit

So my conclusion is that JDBC is not allowed to read the contents of conf.d. Is that true?

You can still start it as a service; you may set log.level in /etc/logstash/logstash.yml.
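By the way, that error is not about permissions: --path.settings must point at the directory containing logstash.yml, while -f takes the pipeline file. A corrected foreground invocation would look something like this (paths taken from your messages):

sudo /usr/share/logstash/bin/logstash --path.settings /etc/logstash -f /etc/logstash/conf.d/places.conf --log.level=debug

For the service route, the equivalent is a single line in /etc/logstash/logstash.yml:

log.level: debug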

I added log.level to logstash.yml and restarted Logstash via the service. Here you can find all the logs since the restart.
Again, thank you! :slight_smile:

When you execute the SQL query SELECT id, extID, name, address, postcode, country, city FROM Place, does it return any results at all?

Yes, it does

Ok, so we need tracing here. Set it to log.level: trace and re-execute.

Here are the logs with trace :slight_smile:

Can you please let it run for a few minutes? Also, when you execute the query in MySQL directly (through another client), how long does it take to finish?

Sure! Here are 12 minutes of logs.
When I execute the query in another client, the result is immediate.

I don't see any issues in the logs, so either the database table is empty or the rows did get into Elasticsearch. What's the output of:

curl localhost:9200/_cat/indices?v
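To also rule out an empty result set on the MySQL side, a quick check against the table from your query:

SELECT COUNT(*) FROM Place;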

Only indices with a dot in front (e.g. .monitoring-logstash-6-2018.05.10), all green, and one index 'cat' in yellow. No other indices.

But could it be due to something else, like Java or JDBC? Because there is this issue:

[2018-05-14T12:36:04,792][DEBUG][io.netty.util.internal.NativeLibraryLoader] Unable to load the library 'netty_tcnative_linux_x86_64', trying other loading mechanism.
java.lang.UnsatisfiedLinkError: no netty_tcnative_linux_x86_64 in java.library.path