Logstash creates a new index every 24 hours

Hi all,

I am using https://openweathermap.org/ current weather API to display current weather data.

I was able to successfully visualize the data in Kibana after Logstash created a new index, weather-2018.07.18.

Here is my logstash config file.

input {
  http_poller {
    urls => {
        weather => {
            url => "http://api.openweathermap.org/data/2.5/weather?id=5490223&appid=MY_APP_ID&units=metric"
            headers => {
              Accept => "application/json"
            }
        }
    }
    schedule => { cron => "* */2 * * * *" }
    codec => json
  }
}
filter {
  mutate {
    remove_field => ["@version" ,"command" ,"host" ,"cod" ,"id" ,"base" ,"coord" ,"sys" ,"dt"]
  }
  split { field => "weather" }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "weather-%{+YYYY.MM.dd}"
  }
  stdout {
    codec => rubydebug
  }
}

However, after 24 hours a new index, weather-2018.07.19, was created in Elasticsearch.

health status index                           uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   .kibana                         ZB7x3FQyQlyRvkRIaEGi2g   1   0          5            0     32.7kb         32.7kb
green  open   .monitoring-kibana-6-2018.07.12 8iFI69JhQL-bLWsGZzFQYA   1   0       2370            0    697.8kb        697.8kb
green  open   .monitoring-es-6-2018.07.12     4-XcOMWsRkW7_enzqslTQg   1   0      34226          115     12.7mb         12.7mb
yellow open   logstash-2018.07.13             _oiQOnpMQ4KQcHssK8PmxA   5   1       9999            0        7mb            7mb
green  open   .monitoring-es-6-2018.07.18     h_ylFE80RS6153Sx5A7o6A   1   0      51522          175     23.4mb         23.4mb
green  open   .monitoring-es-6-2018.07.15     27pTcx8TTDqchfQgo7WQ-w   1   0        234            0    160.2kb        160.2kb
green  open   .monitoring-kibana-6-2018.07.16 Lmydu8eOS8-p6aKPP7SUqw   1   0       1383            0    465.1kb        465.1kb
green  open   .monitoring-es-6-2018.07.17     HThfGRdATO-C455Xm_-e9Q   1   0      49107          175     18.1mb         18.1mb
green  open   .monitoring-kibana-6-2018.07.17 u1EedigZSWeXOaOwdueEdQ   1   0       2450            0    700.7kb        700.7kb
green  open   .monitoring-kibana-6-2018.07.14 17TgPEalTAOJRe3gb59eZA   1   0         15            0     87.5kb         87.5kb
green  open   .monitoring-es-6-2018.07.19     YbmtJ2qLSNSAJIEluEz3Zw   1   0       9304          235      9.9mb          9.9mb
green  open   .monitoring-es-6-2018.07.16     3iZspMNoQUuty5lYM5ylvg   1   0      39028          135     17.3mb         17.3mb
yellow open   weather-2018.07.19              M7b9ojh3Qc631m8NqlZOzA   5   1       1364            0    348.7kb        348.7kb
green  open   .monitoring-kibana-6-2018.07.19 br9oBb0ISgi5x-ZkDb7Hnw   1   0        372            0    346.5kb        346.5kb
green  open   .monitoring-es-6-2018.07.13     1FBGvkSzR4yf7WcyrwN3LQ   1   0      51214           50     19.2mb         19.2mb
yellow open   weather-2018.07.18              EAxPkKAPQceMEH9K2u_EEQ   5   1       4378            0    731.1kb        731.1kb
green  open   .monitoring-kibana-6-2018.07.18 ZnYApUXKQbiH9CVFRqmzdw   1   0       2463            0    742.6kb        742.6kb
green  open   .monitoring-kibana-6-2018.07.15 9TZ9csgUQp-LrAbRYT4aGw   1   0          9            0    127.5kb        127.5kb
green  open   .monitoring-kibana-6-2018.07.13 Cpt83ctnQiqLWiydy1nmUw   1   0       2585            0    830.1kb        830.1kb
green  open   .monitoring-es-6-2018.07.14     fuXHKIMNS5u15-XR0dvrTw   1   0        380          162    575.3kb        575.3kb
yellow open   current_weather-2018.07.19      Xm-WY_9oTQOP1gXk9k3T9Q   5   1          1            0       460b           460b

I'm guessing this has something to do with the index configuration option. How do I make sure that there is only one index that Logstash keeps writing the live weather data to? Also, how do I configure the http_poller plugin so that it keeps pulling data every second? I would really appreciate your help. Thanks in advance!

Remove the -%{+YYYY.MM.dd} from the index option on your elasticsearch output.
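
For context, %{+YYYY.MM.dd} is a date format that is filled in from each event's @timestamp, so the index name changes once per day (in UTC) and Elasticsearch creates a new index to match. A minimal sketch of the output with a fixed index name, reusing the host and index name from the config above:

```
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    # Fixed name: every event goes to the same index, regardless of date.
    index => "weather"
  }
  stdout {
    codec => rubydebug
  }
}
```

The indices that already exist (weather-2018.07.18, weather-2018.07.19) are not touched by this change; only new events go to the single weather index.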

You mean you want it to fetch the current weather once a second? I don't think the OpenWeather folks will be happy if you do that.

It is possible to do it. I read the documentation, and if I am not wrong, I can make 60 requests per minute.

Thanks! :)

Also, if I am not wrong about the API calling, how do I schedule the cron to make a request every second?

Remove the /2 from the schedule option in your input.
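
As a sketch of what that looks like, assuming the six-field cron syntax used by the scheduler behind http_poller, where the first field is seconds:

```
input {
  http_poller {
    # urls => { ... } unchanged from the original config
    # Fields: second minute hour day-of-month month day-of-week.
    # With "*/2" in the minute field, the poller fires every second,
    # but only during every second minute. With "*" it fires every second.
    schedule => { cron => "* * * * * *" }
    codec => json
  }
}
```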

Thanks a lot! Also, is there a way to alter the values in the live API data polled by the http_poller input plugin? I want to modify the data retrieved from openweathermap every 30 seconds before sending it to Elasticsearch. Is there a way to do it?

I would really appreciate if someone could help me out. Thanks!

Yes. Logstash has a wide variety of filter plugins that can manipulate the data, including a ruby filter that can do pretty much anything.

What does the data look like? Show us the output from output { stdout { codec => rubydebug } }, or copy and paste a document from the JSON tab in Kibana Discover, and explain exactly what you want to change in it.

Hi Badger,

Here is the response as it is printed to the console.

{
    "visibility" => 16093,
    "@timestamp" => 2018-08-02T18:08:40.310Z,
          "name" => "Santa Clara",
          "wind" => {
          "deg" => 290,
        "speed" => 3.1
    },
        "clouds" => {
        "all" => 1
    },
          "main" => {
        "temp_min" => 30,
        "humidity" => 25,
        "temp_max" => 30,
            "temp" => 30,
        "pressure" => 1024
    },
      "@version" => "1",
       "weather" => {
                 "id" => 800,
               "main" => "Clear",
        "description" => "clear sky",
               "icon" => "01d"
    }
}

Here is the JSON document from the JSON tab in Kibana:

{
  "_index": "weather",
  "_type": "doc",
  "_id": "wsDX-2QBol0Vk4cfQdI2",
  "_version": 1,
  "_score": null,
  "_source": {
    "visibility": 16093,
    "@timestamp": "2018-08-02T18:12:25.421Z",
    "name": "Santa Clara",
    "wind": {
      "deg": 290,
      "speed": 3.1
    },
    "clouds": {
      "all": 1
    },
    "main": {
      "temp_min": 30,
      "humidity": 25,
      "temp_max": 30,
      "temp": 30,
      "pressure": 1024
    },
    "@version": "1",
    "weather": {
      "id": 800,
      "main": "Clear",
      "description": "clear sky",
      "icon": "01d"
    }
  },
  "fields": {
    "@timestamp": [
      "2018-08-02T18:12:25.421Z"
    ]
  },
  "sort": [
    1533233545421
  ]
}

And this is the Logstash config file I am currently using:

weather_data.conf

input {
  http_poller {
    urls => {
        weather => {
            url => "http://api.openweathermap.org/data/2.5/weather?id=5490223&appid=MY_APP_ID&units=metric"
            headers => {
              Accept => "application/json"
            }
        }
    }
    schedule => { cron => "* * * * * *" }
    codec => json
  }
}
filter {
  mutate {
    remove_field => ["@version" ,"command" ,"host" ,"cod" ,"id" ,"base" ,"coord" ,"sys" ,"dt"]
  }
  split { field => "weather" }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "weather"
  }
  stdout {
    codec => rubydebug
  }
}

My goal is to add 10 to the main.temp value (main.temp = main.temp + 10) every 30 seconds, while still pulling data into Logstash every second.

I am clueless about how this can be achieved. I'd be really grateful if you could let me know what needs to be added or changed.

Thanks!

A strange looking ask. But it could be done...

ruby { code => ' if Time.now.to_i % 30 == 0 ; event.set("[main][temp]", 10 + event.get("[main][temp]")); end' }
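
To see what that condition does outside Logstash, here is a minimal plain-Ruby sketch. The helper name bump_temp and its parameters are made up for illustration; only the modulo check and the +10 come from the filter above.

```ruby
# Hypothetical helper that mirrors the filter's condition; not part of Logstash.
# temp          - the current [main][temp] value
# epoch_seconds - what Time.now.to_i returned when the event was filtered
# period        - how often (in seconds) the bump should apply
# delta         - how much to add
def bump_temp(temp, epoch_seconds, period = 30, delta = 10)
  # Leave the value alone unless the second is an exact multiple of the period.
  return temp unless (epoch_seconds % period).zero?
  temp + delta
end

# Walk through a few seconds and show which ones change the value.
[59, 60, 61, 90].each do |second|
  puts "second #{second}: temp becomes #{bump_temp(30, second)}"
end
```

Because the check uses the wall-clock second at which the event is filtered, the bump applies to every event that is processed during that second, and it is skipped entirely if no event happens to be processed then.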

Thanks a lot Badger! :) It works! The purpose of this is to simulate an environment in which, if the temperature exceeds a certain threshold, an email notification is sent to the user. Since the temperature stays nearly constant over a short period of time, it is hard to capture drastic changes.

With that said, if I wanted it to be changed every 5 mins instead of every 30s, is the below configuration correct?

ruby { code => ' if Time.now.to_i % 300 == 0 ; event.set("[main][temp]", 10 + event.get("[main][temp]")); end' }

Yes, that is correct.

Thanks! :)

Hey Badger,

I created a Node.js server so that I could fetch the latest JSON document in the index and send an email if the temperature exceeds 40℉. Unfortunately, I cannot do that because I would need the document id in order to fetch the JSON document. So I figured I could send the email directly from Logstash instead.

How do I configure my logstash configuration file to send an email whenever the temperature exceeds 40℉? Could you please help me with this? I really am unable to figure out another way to do this.

Thanks in advance again!

I am not convinced you need to know the document id in advance to fetch the most recent document, since that is exactly what the Kibana Discover tab does. But that would be a question for the elasticsearch category.
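
As a rough illustration of that point, a search that sorts on @timestamp in descending order and asks for a single hit returns the newest document without knowing its id. The sketch below is in Ruby rather than Node.js; the function names, the localhost:9200 address, and the weather index name are assumptions taken from the config in this thread, and it has not been run against a real cluster.

```ruby
require "json"
require "net/http"
require "uri"

# Search body: one hit only, newest @timestamp first.
def latest_document_query
  { "size" => 1, "sort" => [{ "@timestamp" => { "order" => "desc" } }] }
end

# Assumption: Elasticsearch listens on localhost:9200 and the index is "weather".
# Returns the newest hit as a Hash, or nil if the index has no documents.
def fetch_latest(host = "localhost", port = 9200, index = "weather")
  uri = URI("http://#{host}:#{port}/#{index}/_search")
  request = Net::HTTP::Post.new(uri, "Content-Type" => "application/json")
  request.body = JSON.generate(latest_document_query)
  response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
  hits = JSON.parse(response.body).dig("hits", "hits")
  hits && hits.first
end

# Example use (needs a running Elasticsearch, so it is left commented out):
# latest = fetch_latest
# puts latest["_source"]["main"]["temp"] if latest
```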

You can conditionalize an output based on the fields. So something like

output {
    if [main][temp] >= 40 {
        email {
            [...]
        }
    }
}

I have never used the email output plugin, but there is documentation, and if you have a problem you can post another question and someone else might reply.

Thanks Badger,

I have created a new topic: https://discuss.elastic.co/t/logstash-error-while-using-email-plugin/142840. I would appreciate it if you could take a look at it and let me know what I am doing wrong.

Thanks!
