Unable to connect Logstash with Elasticsearch

Hello,

I have installed Logstash, Elasticsearch, and Kibana on my CentOS 7 machine. The installations all completed successfully, but when I try to ingest the Apache access logs, I get no output. Here is what I have done.

I've created a file with sudo vi /etc/logstash/conf.d/01-webserver.conf

I added the following configuration to the file to read the logs:

input
{
file
{
path => "/var/log/httpd/access_log"
start_position => "beginning"
}
}
filter
{
if [type] == "apache-access"
{
grok
{
match => { "message" => "%{COMBINEDAPACHELOG}" }
}
}
date
{
match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
}
}
output
{
elasticsearch
{
hosts => ["127.0.0.1:9200"]
}
stdout { codec => rubydebug }
}

Then I ran this command to list the Logstash indexes:

curl -XGET http://127.0.0.1:9200/_cat/indices?v

It gives me this:

health status index pri rep docs.count docs.deleted store.size pri.store.size
yellow open .kibana 1 1 1 0 3.1kb 3.1kb

It doesn't show any log-related Logstash indexes here.
Can you please help!!

This is hard to read. A little indentation goes a long way to improve readability.

You can also embed your code between two lines with triple back-ticks, like this:

```
YOUR CODE PASTED HERE
```

When I try to reformat your config, I get this:

input {
  file {
    path => "/var/log/httpd/access_log"
    start_position => "beginning"
  }
}

filter {
  if [type] == "apache-access" {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}

output {
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
  }
  stdout { codec => rubydebug }
}

The first thing I notice is that you're testing for [type], but you haven't assigned a type anywhere.

Perhaps you could add type => "apache-access" to your file plugin block:

  file {
    path => "/var/log/httpd/access_log"
    start_position => "beginning"
    type => "apache-access"
  }

Also, you will need to remove the sincedb for the file you're tailing. Just because it says start_position => "beginning" doesn't mean it will repeatedly reingest the file from the start. It does so the first time, but then it remembers where it left off and resumes from there. If you're running a 5.x version of Logstash, and installed via RPM or DEB, then look in /var/lib/logstash/input/file for sincedb files, and delete them. Otherwise, they will be in $LS_HOME/data/input/file, if memory serves.
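
If you're not sure where they ended up, a broad search should turn them up. A minimal sketch, assuming a default package layout (the exact locations and file names vary by Logstash version and install method):

```
# Sketch: locate any sincedb files left behind by the file input.
# Locations and names vary by Logstash version and install method.
sudo find /var/lib/logstash "$HOME" -name '*sincedb*' -type f 2>/dev/null

# After confirming which files track the tailed log, delete them, e.g.:
# sudo rm /var/lib/logstash/input/file/.sincedb_<hash>
```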

Hey Aaron,

Thanks for your response, and sorry for the trouble reading the code.
I did add type => "apache-access", but I'm not able to find the sincedb files to delete.
After adding the type, I ran the curl command and it still gives me the same output.

What version of Logstash are you using? We need to find those sincedb files, or you will not be able to re-read the Apache log files.

Also, how are you starting Logstash? You have the stdout output plugin, but it is not useful unless you're starting Logstash manually at the command line.
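
For reference, a manual foreground run looks something like this (a sketch; the binary path here assumes an RPM install, so adjust it to yours):

```
# Run Logstash in the foreground against a single config file so the
# stdout { codec => rubydebug } output is visible in the terminal.
/opt/logstash/bin/logstash -f /etc/logstash/conf.d/01-webserver.conf
```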

My Logstash version is 2.4.1. I'm starting Logstash manually from the terminal.

If that's the case, then your sincedb entries are likely to be in $HOME/.sincedb* (for whichever user is launching Logstash).
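
A quick way to check, assuming you run Logstash as that same user:

```
# List any sincedb files in the launching user's home directory
ls -la ~/.sincedb*
```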

Hello Sir,

I am so sorry, but I can't find any sincedb files.

You have the option to specify a sincedb path:

input {
  file {
    path => "/var/log/httpd/access_log"
    start_position => "beginning"
    sincedb_path => "/home/user/mysincedb"
  }
}

This provides two benefits.

  1. This is a new sincedb file, so it should read from the "beginning" again.
  2. You will know which file to erase between tests (see the example below).
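
For example, between test runs you could simply delete that one known file and start Logstash again:

```
# Reset the file input by removing the explicitly-configured sincedb,
# then re-run Logstash to read the log from the beginning.
rm -f /home/user/mysincedb
```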

I did it, but I was still unable to locate the sincedb files. So I re-installed Logstash, Elasticsearch and Kibana. It's the same issue again: I'm unable to see the indexes. When I check /var/log/logstash, I see these errors:

{:timestamp=>"2017-04-20T08:54:35.379000-0400", :message=>"fetched an invalid config", :config=>"input\n{\nfile\n{\npath => "/var/log/httpd/access_log"\nstart_position => "beginning"\n}\n}\nfilter \n{\nif [type] == "apache-access"\n{\ngrok\n{\nmatch => { "message" => "%{COMBINEDAPACHELOG}" }\n}\n}\ndate\n{\nmatch => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]\n}\n}\noutput\n{\nelasticsearch\n{\nhosts => ["localhost:9200"]\n}\nstdout { codec => rubydebug }\n}\n\n\ninput {\n stdin {}\n}\noutput {\n stdout {}\n}\n\ninput {\nfile {\npath => [ "/var/log/*.log", "/var/log/messages", "/var/log/syslog" ]\ntype => "syslog"\n}\n}\n\noutput {\nelasticsearch { host => 127.0.0.1 }\nstdout { codec => rubydebug }\n}\n\n", :reason=>"Expected one of #, } at line 48, column 30 (byte 526) after output {\nelasticsearch { host => 127.0", :level=>:error}

{:timestamp=>"2017-04-20T09:14:16.449000-0400", :message=>"fetched an invalid config", :config=>"input\n{\nfile\n{\npath => "/var/log/httpd/access_log"\nstart_position => "beginning"\n}\n}\nfilter \n{\nif [type] == "apache-access"\n{\ngrok\n{\nmatch => { "message" => "%{COMBINEDAPACHELOG}" }\n}\n}\ndate\n{\nmatch => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]\n}\n}\noutput\n{\nelasticsearch\n{\nhosts => ["localhost:9200"]\n}\nstdout { codec => rubydebug }\n}\n\n\ninput {\n stdin {}\n}\noutput {\n stdout {}\n}\n\ninput {\nfile {\npath => [ "/var/log/*.log", "/var/log/messages", "/var/log/syslog" ]\ntype => "syslog"\n}\n}\n\noutput {\nelasticsearch { host => 127.0.0.1 }\nstdout { codec => rubydebug }\n}\n\n", :reason=>"Expected one of #, } at line 48, column 30 (byte 526) after output {\nelasticsearch { host => 127.0", :level=>:error}

{:timestamp=>"2017-04-20T09:43:13.407000-0400", :message=>"fetched an invalid config", :config=>"input\n{\nfile\n{\npath => "/var/log/httpd/access_log"\ntype => "apache-access"\nstart_position => "beginning"\n}\n}\nfilter \n{\nif [type] == "apache-access"\n{\ngrok\n{\nmatch => { "message" => "%{COMBINEDAPACHELOG}" }\n}\n}\ndate\n{\nmatch => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]\n}\n}\noutput\n{\nelasticsearch\n{\nhosts => ["localhost:9200"]\n}\nstdout { codec => rubydebug }\n}\n\n\ninput {\n stdin {}\n}\noutput {\n stdout {}\n}\n\ninput {\nfile {\npath => [ "/var/log/*.log", "/var/log/messages", "/var/log/syslog" ]\ntype => "syslog"\n}\n}\n\noutput {\nelasticsearch { host => 127.0.0.1 }\nstdout { codec => rubydebug }\n}\n\n", :reason=>"Expected one of #, } at line 49, column 30 (byte 550) after output {\nelasticsearch { host => 127.0", :level=>:error}

Can you please look into it.

I can't help you if you won't do what I suggest.

The error indicates that your configuration is missing something, or is otherwise misconfigured. In fact, the dumped :config contains several input and output blocks beyond the one you shared (a stdin block, a syslog file block), which suggests Logstash is reading every file in your conf.d directory, and the parse failure itself points at elasticsearch { host => 127.0.0.1 }, where the address is not quoted. But besides this, you have ignored my suggestion to manually specify the sincedb_path, as is clear from the configuration I found:

:message=>"fetched an invalid config", :config=>"input\n{\nfile\n{\npath => \"/var/log/httpd/access_log\"\nstart_position => \"beginning\"\n}\n}\n

Which breaks down to:

input {
  file {
    path => "/var/log/httpd/access_log"
    start_position => "beginning"
  }
}

It also seems (though it's hard to tell for sure) you are still not indenting your configuration, which makes it extremely hard to read, especially when you are looking for a misconfiguration.

I can't help you if you don't try to do the things I recommend.

Hello sir, I already did manually specify the sincedb_path, and as it didn't work, I had to start from scratch. I am here to seek help. Also, what I sent you is an error log, not my config file. My config file is indented. Here is my config file:

input {
file {
path => "/var/log/httpd/access_log"
start_position => "beginning"
sincedb_path => "/home/likhita/mysincedb"
}
}

filter {
if [type] == "apache-access"
{
grok {
match => { "message" => "%{COMBINEDAPACHELOG}" }
}
}
date {
match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
}
}

output {
elasticsearch {
hosts => ["localhost:9200"]
}
stdout { codec => rubydebug }
}

Sorry if I did something wrong!!

When I paste it here my indentation goes off!!!

Here it is!! [screenshot of the config]

If you paste your code between triple back-ticks, as I mentioned earlier in this thread, it will not lose formatting:

```
YOUR CODE PASTED HERE
```

I need you to do this, because I cannot copy and paste a screen-shot.

input {
  file {
    path => "/var/log/httpd/access_log"
    start_position => "beginning"
    sincedb_path => "/home/likhita/mysincedb"
  }
}

filter {
if [type] == "apache-access"
{
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
  stdout { codec => rubydebug }
}

Thank you. That's much more readable, and I was able to cut/paste it and test it on my laptop. I changed these settings to fit my paths:

    path => "/Users/buh/test_access_log"
    sincedb_path => "/Users/buh/mysincedb"

I spun up a local Elasticsearch 5.3.0 instance, so it would match what is in the config. I put a single line into the /Users/buh/test_access_log file:

177.140.155.68 - - [24/Apr/2017:13:37:50 +0000] "GET / HTTP/1.0" 200 612 "-" "Wget(linux)"

Everything else is exactly as you have it, stored in a file I named test.conf. I ran:

bin/logstash -f test.conf --debug

and this is part of what I saw in the output:

[2017-04-24T13:29:43,513][INFO ][logstash.pipeline        ] Pipeline main started
[2017-04-24T13:29:43,516][DEBUG][logstash.inputs.file     ] _globbed_files: /Users/buh/test_access_log: glob is: ["/Users/buh/test_access_log"]
[2017-04-24T13:29:43,517][DEBUG][logstash.inputs.file     ] _discover_file: /Users/buh/test_access_log: new: /Users/buh/test_access_log (exclude is [])
[2017-04-24T13:29:43,518][DEBUG][logstash.inputs.file     ] _open_file: /Users/buh/test_access_log: opening
[2017-04-24T13:29:43,518][DEBUG][logstash.inputs.file     ] /Users/buh/test_access_log: initial create, no sincedb, seeking to beginning of file
[2017-04-24T13:29:43,518][DEBUG][logstash.inputs.file     ] Received line {:path=>"/Users/buh/test_access_log", :text=>"177.140.155.68 - - [24/Apr/2017:13:37:50 +0000] \"GET / HTTP/1.0\" 200 612 \"-\" \"Wget(linux)\""}
[2017-04-24T13:29:43,519][DEBUG][logstash.agent           ] Starting puma
[2017-04-24T13:29:43,521][DEBUG][logstash.agent           ] Trying to start WebServer {:port=>9600}
[2017-04-24T13:29:43,522][DEBUG][logstash.api.service     ] [api-service] start
[2017-04-24T13:29:43,535][DEBUG][logstash.inputs.file     ] writing sincedb (delta since last write = 1493062183)
[2017-04-24T13:29:43,542][DEBUG][logstash.pipeline        ] filter received {"event"=>{"path"=>"/Users/buh/test_access_log", "@timestamp"=>2017-04-24T19:29:43.533Z, "@version"=>"1", "host"=>"localhost.local", "message"=>"177.140.155.68 - - [24/Apr/2017:13:37:50 +0000] \"GET / HTTP/1.0\" 200 612 \"-\" \"Wget(linux)\""}}
[2017-04-24T13:29:43,545][DEBUG][logstash.pipeline        ] output received {"event"=>{"path"=>"/Users/buh/test_access_log", "@timestamp"=>2017-04-24T19:29:43.533Z, "@version"=>"1", "host"=>"localhost.local", "message"=>"177.140.155.68 - - [24/Apr/2017:13:37:50 +0000] \"GET / HTTP/1.0\" 200 612 \"-\" \"Wget(linux)\""}}
[2017-04-24T13:29:43,556][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
{
          "path" => "/Users/buh/test_access_log",
    "@timestamp" => 2017-04-24T19:29:43.533Z,
      "@version" => "1",
          "host" => "localhost.local",
       "message" => "177.140.155.68 - - [24/Apr/2017:13:37:50 +0000] \"GET / HTTP/1.0\" 200 612 \"-\" \"Wget(linux)\""
}
[2017-04-24T13:29:48,519][DEBUG][logstash.pipeline        ] Pushing flush onto pipeline
^C[2017-04-24T13:29:53,288][WARN ][logstash.runner          ] SIGINT received. Shutting down the agent.

As you can see, I hit ^C there at the end to kill Logstash. My Elasticsearch logged this:

[2017-04-24T13:29:43,605][INFO ][o.e.c.m.MetaDataCreateIndexService] [yZW_CPx] [logstash-2017.04.24] creating index, cause [auto(bulk api)], templates [logstash], shards [5]/[1], mappings [_default_]
[2017-04-24T13:29:43,793][INFO ][o.e.c.m.MetaDataMappingService] [yZW_CPx] [logstash-2017.04.24/T8t6Pts3QSaVDZvoZw_JKg] create_mapping [logs]

When I queried Elasticsearch, I found the line I expected:

$ curl -XGET http://localhost:9200/logstash-2017.04.24/_search?pretty -d '
{
  "query": {
    "match_all": {}
  }
}
'
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "logstash-2017.04.24",
        "_type" : "logs",
        "_id" : "AVuhcOtGokrJPubvTV9I",
        "_score" : 1.0,
        "_source" : {
          "path" : "/Users/buh/test_access_log",
          "@timestamp" : "2017-04-24T19:29:43.533Z",
          "@version" : "1",
          "host" : "localhost.local",
          "message" : "177.140.155.68 - - [24/Apr/2017:13:37:50 +0000] \"GET / HTTP/1.0\" 200 612 \"-\" \"Wget(linux)\""
        }
      }
    ]
  }
}

As you can also see in the above Logstash debug output, it updated the sincedb file. The four columns are the inode, the major and minor device numbers, and the byte offset where reading stopped:

$ cat ~/mysincedb
58694687 1 4 91

Since it's clear you wanted your grok rule to work, I took the liberty of adding type => "apache-access" to your config, so the conditional if statement would have something to match against:

input {
  file {
    path => "/Users/buh/test_access_log"
    start_position => "beginning"
    sincedb_path => "/Users/buh/mysincedb"
    type => "apache-access"
  }
}

Now, I delete the sincedb file:

rm /Users/buh/mysincedb

(I also deleted the index in Elasticsearch so it would be a fresh start)
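
Deleting the index is a single call, something like:

```
# Remove the day's logstash index so the next run starts clean
curl -XDELETE http://localhost:9200/logstash-2017.04.24
```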

Now, when I re-run bin/logstash -f test.conf --debug, I see what you were probably expecting:

[2017-04-24T13:41:50,598][INFO ][logstash.pipeline        ] Starting pipeline {"id"=>"main", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>1000}
[2017-04-24T13:41:50,703][INFO ][logstash.pipeline        ] Pipeline main started
[2017-04-24T13:41:50,705][DEBUG][logstash.inputs.file     ] _globbed_files: /Users/buh/test_access_log: glob is: ["/Users/buh/test_access_log"]
[2017-04-24T13:41:50,706][DEBUG][logstash.inputs.file     ] _discover_file: /Users/buh/test_access_log: new: /Users/buh/test_access_log (exclude is [])
[2017-04-24T13:41:50,707][DEBUG][logstash.inputs.file     ] _open_file: /Users/buh/test_access_log: opening
[2017-04-24T13:41:50,707][DEBUG][logstash.inputs.file     ] /Users/buh/test_access_log: initial create, no sincedb, seeking to beginning of file
[2017-04-24T13:41:50,707][DEBUG][logstash.inputs.file     ] Received line {:path=>"/Users/buh/test_access_log", :text=>"177.140.155.68 - - [24/Apr/2017:13:37:50 +0000] \"GET / HTTP/1.0\" 200 612 \"-\" \"Wget(linux)\""}
[2017-04-24T13:41:50,714][DEBUG][logstash.agent           ] Starting puma
[2017-04-24T13:41:50,715][DEBUG][logstash.agent           ] Trying to start WebServer {:port=>9600}
[2017-04-24T13:41:50,716][DEBUG][logstash.api.service     ] [api-service] start
[2017-04-24T13:41:50,726][DEBUG][logstash.inputs.file     ] writing sincedb (delta since last write = 1493062910)
[2017-04-24T13:41:50,732][DEBUG][logstash.pipeline        ] filter received {"event"=>{"path"=>"/Users/buh/test_access_log", "@timestamp"=>2017-04-24T19:41:50.724Z, "@version"=>"1", "host"=>"localhost.local", "message"=>"177.140.155.68 - - [24/Apr/2017:13:37:50 +0000] \"GET / HTTP/1.0\" 200 612 \"-\" \"Wget(linux)\"", "type"=>"apache-access"}}
[2017-04-24T13:41:50,733][DEBUG][logstash.filters.grok    ] Running grok filter {:event=>2017-04-24T19:41:50.724Z localhost.local 177.140.155.68 - - [24/Apr/2017:13:37:50 +0000] "GET / HTTP/1.0" 200 612 "-" "Wget(linux)"}
[2017-04-24T13:41:50,739][DEBUG][logstash.filters.grok    ] Event now:  {:event=>2017-04-24T19:41:50.724Z localhost.local 177.140.155.68 - - [24/Apr/2017:13:37:50 +0000] "GET / HTTP/1.0" 200 612 "-" "Wget(linux)"}
[2017-04-24T13:41:50,746][DEBUG][logstash.pipeline        ] output received {"event"=>{"request"=>"/", "agent"=>"\"Wget(linux)\"", "auth"=>"-", "ident"=>"-", "verb"=>"GET", "message"=>"177.140.155.68 - - [24/Apr/2017:13:37:50 +0000] \"GET / HTTP/1.0\" 200 612 \"-\" \"Wget(linux)\"", "type"=>"apache-access", "path"=>"/Users/buh/test_access_log", "referrer"=>"\"-\"", "@timestamp"=>2017-04-24T13:37:50.000Z, "response"=>"200", "bytes"=>"612", "clientip"=>"177.140.155.68", "@version"=>"1", "host"=>"localhost.local", "httpversion"=>"1.0", "timestamp"=>"24/Apr/2017:13:37:50 +0000"}}
[2017-04-24T13:41:50,747][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
{
        "request" => "/",
          "agent" => "\"Wget(linux)\"",
           "auth" => "-",
          "ident" => "-",
           "verb" => "GET",
        "message" => "177.140.155.68 - - [24/Apr/2017:13:37:50 +0000] \"GET / HTTP/1.0\" 200 612 \"-\" \"Wget(linux)\"",
           "type" => "apache-access",
           "path" => "/Users/buh/test_access_log",
       "referrer" => "\"-\"",
     "@timestamp" => 2017-04-24T13:37:50.000Z,
       "response" => "200",
          "bytes" => "612",
       "clientip" => "177.140.155.68",
       "@version" => "1",
           "host" => "localhost.local",
    "httpversion" => "1.0",
      "timestamp" => "24/Apr/2017:13:37:50 +0000"
}
^C[2017-04-24T13:41:53,503][WARN ][logstash.runner          ] SIGINT received. Shutting down the agent.

And again, when queried from Elasticsearch:

$ curl -XGET http://localhost:9200/logstash-2017.04.24/_search?pretty -d '
{
  "query": {
    "match_all": {}
  }
}
'
{
  "took" : 29,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "logstash-2017.04.24",
        "_type" : "apache-access",
        "_id" : "AVuhfAPwRx488yJUxuVA",
        "_score" : 1.0,
        "_source" : {
          "request" : "/",
          "agent" : "\"Wget(linux)\"",
          "auth" : "-",
          "ident" : "-",
          "verb" : "GET",
          "message" : "177.140.155.68 - - [24/Apr/2017:13:37:50 +0000] \"GET / HTTP/1.0\" 200 612 \"-\" \"Wget(linux)\"",
          "type" : "apache-access",
          "path" : "/Users/buh/test_access_log",
          "referrer" : "\"-\"",
          "@timestamp" : "2017-04-24T13:37:50.000Z",
          "response" : "200",
          "bytes" : "612",
          "clientip" : "177.140.155.68",
          "@version" : "1",
          "host" : "localhost.local",
          "httpversion" : "1.0",
          "timestamp" : "24/Apr/2017:13:37:50 +0000"
        }
      }
    ]
  }
}

Everything works, and /Users/buh/mysincedb is again updated, per the log file:

$ cat /Users/buh/mysincedb
58694687 1 4 91

Thank you so much for your help!!! I will follow the same steps and see if everything works fine.

I followed the procedure that you suggested, and it turned out to be working fine now. I am able to see the Logstash indices when I run the curl command. Thank you so much for your help!!! Appreciate it.