Getting info from a website with Docker's Logstash image

I'm new to Logstash, and I've been trying to get a Docker container running Logstash to pull stock market data for a project. Because I have no experience with this tool, I haven't been able to get any data from any website (I'm trying https://www.marketwatch.com/investing/index/djia, but I've tried a few others too).

Below are some of my configuration files and the output after running. At this point I'm not sure what I'm doing right or wrong, or where the problem lies.

dow.conf:

input {
  #http_poller {
  #  urls => {
  #    myresource => "https://www.marketwatch.com/investing/index/djia"
  #    }
  #  }
    #request_timeout => 60
    #interval => 60
    #codec => "json" # set this if the response is json formatted
  http {
    host => "https://www.marketwatch.com/investing/index/djia"
    id => "dow_id"
  }
}
output {
 file {
   path => /usr/share/logstash/data/DOW.log
   #codec => line { format => "custom format: %{message}"}
 }
}

logstash.yml and pipelines.yml:

http.host: "https://www.marketwatch.com/investing/index/djia"

dockerfile:

FROM logstash:7.7.0
#OS is centos

#needed for permissions
USER root

#install python3, pip3
RUN yum -y install epel-release && yum clean all
RUN yum -y install python3-pip && yum clean all

#pip install necessary libraries
RUN pip3 install python3-logstash
RUN bin/logstash-plugin install logstash-input-http_poller
RUN bin/logstash-plugin install logstash-input-http

#transfer necessary files to image
RUN rm -f /usr/share/logstash/pipeline/logstash.conf
ADD pipeline/ /usr/share/logstash/pipeline/
ADD config/ /usr/share/logstash/config/

#create necessary directories and files with needed permissions
RUN touch /usr/share/logstash/data/DOW.log && chmod 777 /usr/share/logstash/data/DOW.log

response after 'logstash -f dow.conf':

OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.headius.backport9.modules.Modules (file:/usr/share/logstash/logstash-core/lib/jars/jruby-complete-9.2.11.1.jar) to method sun.nio.ch.NativeThread.signal(long)
WARNING: Please consider reporting this to the maintainers of com.headius.backport9.modules.Modules
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Sending Logstash logs to /usr/share/logstash/logs which is now configured via log4j2.properties
[2020-06-13T17:16:04,100][INFO ][logstash.setting.writabledirectory] Creating directory {:setting=>"path.queue", :path=>"/usr/share/logstash/data/queue"}
[2020-06-13T17:16:04,150][INFO ][logstash.setting.writabledirectory] Creating directory {:setting=>"path.dead_letter_queue", :path=>"/usr/share/logstash/data/dead_letter_queue"}
[2020-06-13T17:16:04,849][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2020-06-13T17:16:04,863][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.7.0"}
[2020-06-13T17:16:04,893][INFO ][logstash.agent           ] No persistent UUID file found. Generating new UUID {:uuid=>"224578fc-fa55-4310-ac03-7b61fd02f7a8", :path=>"/usr/share/logstash/data/uuid"}
[2020-06-13T17:16:06,452][ERROR][logstash.agent           ] Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"LogStash::ConfigurationError", :message=>"Expected one of [ \\t\\r\\n], \"#\", [A-Za-z0-9_-], '\"', \"'\", [A-Za-z_], \"-\", [0-9], \"[\", \"{\" at line 19, column 12 (byte 387) after output {\r\n file {\r\n   path => ", :backtrace=>["/usr/share/logstash/logstash-core/lib/logstash/compiler.rb:58:in `compile_imperative'", "/usr/share/logstash/logstash-core/lib/logstash/compiler.rb:66:in `compile_graph'", "/usr/share/logstash/logstash-core/lib/logstash/compiler.rb:28:in `block in compile_sources'", "org/jruby/RubyArray.java:2577:in `map'", "/usr/share/logstash/logstash-core/lib/logstash/compiler.rb:27:in `compile_sources'", "org/logstash/execution/AbstractPipelineExt.java:181:in `initialize'", "org/logstash/execution/JavaBasePipelineExt.java:67:in `initialize'", "/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:43:in `initialize'", "/usr/share/logstash/logstash-core/lib/logstash/pipeline_action/create.rb:52:in `execute'", "/usr/share/logstash/logstash-core/lib/logstash/agent.rb:342:in `block in converge_state'"]}
warning: thread "Api Webserver" terminated with exception (report_on_exception is true):
SocketError: initialize: name or service not known
        initialize at org/jruby/ext/socket/RubyTCPServer.java:124
               new at org/jruby/RubyIO.java:876
  add_tcp_listener at /usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/puma-4.3.3-java/lib/puma/binder.rb:229
  add_tcp_listener at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/forwardable.rb:229
   start_webserver at /usr/share/logstash/logstash-core/lib/logstash/webserver.rb:104
               run at /usr/share/logstash/logstash-core/lib/logstash/webserver.rb:60
              each at org/jruby/RubyRange.java:526
   each_with_index at org/jruby/RubyEnumerable.java:1258
               run at /usr/share/logstash/logstash-core/lib/logstash/webserver.rb:55
   start_webserver at /usr/share/logstash/logstash-core/lib/logstash/agent.rb:393
[2020-06-13T17:16:06,817][ERROR][org.logstash.Logstash    ] java.lang.IllegalStateException: Logstash stopped processing because of an error: (SocketError) initialize: name or service not known

Have you tried wrapping the path in double quotes?

That's probably one of many issues.

After running it again with a new conf file (with double quotes, and port 443 instead of the default 8080):

input {
  #http_poller {
  #  urls => {
  #    myresource => "https://www.marketwatch.com/investing/index/djia"
  #    }
  #  }
    #request_timeout => 60
    #interval => 60
    #codec => "json" # set this if the response is json formatted
  http {
    host => "https://www.marketwatch.com/investing/index/djia"
    port => 443
    id => "dow_id"
  }
}


output {
 file {
   path => "/usr/share/logstash/data/DOW.log"
   #codec => line { format => "custom format: %{message}"}
 }
}

logstash -f dow.conf output:

OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.headius.backport9.modules.Modules (file:/usr/share/logstash/logstash-core/lib/jars/jruby-complete-9.2.11.1.jar) to method sun.nio.ch.NativeThread.signal(long)
WARNING: Please consider reporting this to the maintainers of com.headius.backport9.modules.Modules
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Sending Logstash logs to /usr/share/logstash/logs which is now configured via log4j2.properties
[2020-06-14T19:47:23,588][INFO ][logstash.setting.writabledirectory] Creating directory {:setting=>"path.queue", :path=>"/usr/share/logstash/data/queue"}
[2020-06-14T19:47:23,615][INFO ][logstash.setting.writabledirectory] Creating directory {:setting=>"path.dead_letter_queue", :path=>"/usr/share/logstash/data/dead_letter_queue"}
[2020-06-14T19:47:24,189][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2020-06-14T19:47:24,193][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.7.0"}
[2020-06-14T19:47:24,249][INFO ][logstash.agent           ] No persistent UUID file found. Generating new UUID {:uuid=>"41f66ac0-1bfa-441f-a265-0b2816d4a9b1", :path=>"/usr/share/logstash/data/uuid"}
[2020-06-14T19:47:26,499][INFO ][org.reflections.Reflections] Reflections took 45 ms to scan 1 urls, producing 21 keys and 41 values
[2020-06-14T19:47:27,160][WARN ][org.logstash.instrument.metrics.gauge.LazyDelegatingGauge][main] A gauge metric of an unknown type (org.jruby.RubyArray) has been created for key: cluster_uuids. This may result in invalid serialization.  It is recommended to log an issue to the responsible developer/development team.
[2020-06-14T19:47:27,166][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>250, "pipeline.sources"=>["/usr/share/logstash/pipeline/dow.conf"], :thread=>"#<Thread:0x5bbbd869 run>"}
[2020-06-14T19:47:28,291][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
[2020-06-14T19:47:28,331][INFO ][logstash.inputs.http     ][main][dow_id] Starting http input listener {:address=>"https://www.marketwatch.com/investing/index/djia:443", :ssl=>"false"}
[2020-06-14T19:47:28,372][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2020-06-14T19:47:28,556][ERROR][logstash.javapipeline    ][main][dow_id] A plugin had an unrecoverable error. Will restart this plugin.
  Pipeline_id:main
  Plugin: <LogStash::Inputs::Http host=>"https://www.marketwatch.com/investing/index/djia", id=>"dow_id", port=>443, enable_metric=>true, codec=><LogStash::Codecs::Plain id=>"plain_691537b5-2871-4332-b0f7-0a3c9a8116aa", enable_metric=>true, charset=>"UTF-8">, ssl=>false, ssl_verify_mode=>"none", ssl_handshake_timeout=>10000, tls_min_version=>1, tls_max_version=>1.2, cipher_suites=>["TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384", "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384", "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256", "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256", "TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384", "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384", "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256", "TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256"], additional_codecs=>{"application/json"=>"json"}, response_headers=>{"Content-Type"=>"text/plain"}, remote_host_target_field=>"host", request_headers_target_field=>"headers", threads=>2, max_pending_requests=>200, max_content_length=>104857600, response_code=>200, verify_mode=>"none">
  Error:
  Exception: Java::JavaNioChannels::UnresolvedAddressException
  Stack: sun.nio.ch.Net.checkAddress(sun/nio/ch/Net.java:130)
sun.nio.ch.ServerSocketChannelImpl.bind(sun/nio/ch/ServerSocketChannelImpl.java:222)
io.netty.channel.socket.nio.NioServerSocketChannel.doBind(io/netty/channel/socket/nio/NioServerSocketChannel.java:130)
io.netty.channel.AbstractChannel$AbstractUnsafe.bind(io/netty/channel/AbstractChannel.java:558)
io.netty.channel.DefaultChannelPipeline$HeadContext.bind(io/netty/channel/DefaultChannelPipeline.java:1358)
io.netty.channel.AbstractChannelHandlerContext.invokeBind(io/netty/channel/AbstractChannelHandlerContext.java:501)
io.netty.channel.AbstractChannelHandlerContext.bind(io/netty/channel/AbstractChannelHandlerContext.java:486)
io.netty.channel.DefaultChannelPipeline.bind(io/netty/channel/DefaultChannelPipeline.java:1019)
io.netty.channel.AbstractChannel.bind(io/netty/channel/AbstractChannel.java:254)
io.netty.bootstrap.AbstractBootstrap$2.run(io/netty/bootstrap/AbstractBootstrap.java:366)
io.netty.util.concurrent.AbstractEventExecutor.safeExecute(io/netty/util/concurrent/AbstractEventExecutor.java:163)
io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(io/netty/util/concurrent/SingleThreadEventExecutor.java:404)
io.netty.channel.nio.NioEventLoop.run(io/netty/channel/nio/NioEventLoop.java:462)
io.netty.util.concurrent.SingleThreadEventExecutor$5.run(io/netty/util/concurrent/SingleThreadEventExecutor.java:897)
java.lang.Thread.run(java/lang/Thread.java:834)
warning: thread "Api Webserver" terminated with exception (report_on_exception is true):
SocketError: initialize: name or service not known
        initialize at org/jruby/ext/socket/RubyTCPServer.java:124
               new at org/jruby/RubyIO.java:876
  add_tcp_listener at /usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/puma-4.3.3-java/lib/puma/binder.rb:229
  add_tcp_listener at uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/forwardable.rb:229
   start_webserver at /usr/share/logstash/logstash-core/lib/logstash/webserver.rb:104
               run at /usr/share/logstash/logstash-core/lib/logstash/webserver.rb:60
              each at org/jruby/RubyRange.java:526
   each_with_index at org/jruby/RubyEnumerable.java:1258
               run at /usr/share/logstash/logstash-core/lib/logstash/webserver.rb:55
   start_webserver at /usr/share/logstash/logstash-core/lib/logstash/agent.rb:393
[2020-06-14T19:47:28,815][FATAL][logstash.runner          ] An unexpected error occurred! {:error=>#<SocketError: initialize: name or service not known>, :backtrace=>["org/jruby/ext/socket/RubyTCPServer.java:124:in `initialize'", "org/jruby/RubyIO.java:876:in `new'", "/usr/share/logstash/vendor/bundle/jruby/2.5.0/gems/puma-4.3.3-java/lib/puma/binder.rb:229:in `add_tcp_listener'", "/usr/share/logstash/logstash-core/lib/logstash/webserver.rb:104:in `start_webserver'", "/usr/share/logstash/logstash-core/lib/logstash/webserver.rb:60:in `block in run'", "org/jruby/RubyRange.java:526:in `each'", "org/jruby/RubyEnumerable.java:1258:in `each_with_index'", "/usr/share/logstash/logstash-core/lib/logstash/webserver.rb:55:in `run'", "/usr/share/logstash/logstash-core/lib/logstash/agent.rb:393:in `block in start_webserver'"]}
[2020-06-14T19:47:28,831][ERROR][org.logstash.Logstash    ] java.lang.IllegalStateException: Logstash stopped processing because of an error: (SystemExit) exit

An http input is effectively a server listening on a port on your machine, to which you can send HTTP requests. It looks like you want Logstash to send a request to another site (www.marketwatch.com). You should use an http_poller input for that.
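
A minimal sketch of what that could look like, reusing the URL and output path from your dow.conf. The key name "djia", the 60-second schedule, and the plain codec are my assumptions, not something taken from your setup, so check the option names against the http_poller documentation for the plugin version bundled with 7.7.0:

```
input {
  http_poller {
    # Each key is just a label for the request (hypothetical name here)
    urls => {
      djia => "https://www.marketwatch.com/investing/index/djia"
    }
    request_timeout => 60
    # Recent plugin versions use "schedule" where older docs show "interval"
    schedule => { every => "60s" }
    # The page is HTML rather than JSON, so keep the body as plain text
    codec => "plain"
  }
}

output {
  file {
    # String values must be quoted
    path => "/usr/share/logstash/data/DOW.log"
  }
}
```

Separately, http.host in logstash.yml is the address Logstash's own API binds to, not the site you want to fetch. Setting it to a URL is most likely what produces the "Api Webserver" SocketError, so it should be an address such as "0.0.0.0".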

After some fiddling with the .yml files and the http_poller input (apparently I was using an outdated version of its documentation), I was finally able to get a response. Unfortunately, that response was a CAPTCHA, so I'll have to find some other source.

Thank you for all the help.
