Elasticsearch data indexing for logstash


(Raghu Eswaraiah) #1

As we know, Elasticsearch stores the Logstash indices in the format logstash-yyyy.mm.dd. Does Elasticsearch create the index for a new date by re-indexing the previous day's index?

Example: I am observing that the content of the logstash-* folders grows every day to about twice the previous day's size or more.
logstash-2015.11.17 was ~500MB,
logstash-2015.11.18 was ~1.5 GB and
logstash-2015.11.19 is > 3 GB


(Magnus Bäck) #2

As we know, Elasticsearch stores the Logstash indices in the format logstash-yyyy.mm.dd. Does Elasticsearch create the index for a new date by re-indexing the previous day's index?

No, Elasticsearch doesn't reindex data on its own.
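To illustrate how a daily index gets picked without any reindexing: the Logstash elasticsearch output's default index setting is `logstash-%{+YYYY.MM.dd}`, and the date part is taken from each event's `@timestamp` in UTC. The sketch below only mimics that naming rule in Python; the function name and the UTC-6 local offset are assumptions made for the example, not something taken from this setup.

```python
from datetime import datetime, timedelta, timezone


def daily_index_name(event_time, prefix="logstash-"):
    """Return the daily index name for an event, using the UTC date."""
    utc_time = event_time.astimezone(timezone.utc)
    return prefix + utc_time.strftime("%Y.%m.%d")


# Hypothetical local timezone (UTC-6), chosen only for illustration.
local_tz = timezone(timedelta(hours=-6))

# An event logged late in the evening, local time, on Nov 19 ...
late_evening = datetime(2015, 11, 19, 22, 0, 0, tzinfo=local_tz)

# ... already belongs to the next day's index, because 22:00 at UTC-6
# is 04:00 UTC on Nov 20.
print(daily_index_name(late_evening))  # logstash-2015.11.20
```

So a new index simply starts receiving events once their UTC date rolls over; nothing from the previous day's index is copied into it.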

Example: I am observing that the content of the logstash-* folders grows every day to about twice the previous day's size or more.
logstash-2015.11.17 was ~500MB,
logstash-2015.11.18 was ~1.5 GB and
logstash-2015.11.19 is > 3 GB

And you're not just logging more data?


(Raghu Eswaraiah) #3

If not, how is the logstash folder becoming twice the previous day's size?


(Drew Town) #4

Did you upgrade to 2.0? What is the document count on each index? Did your documents get a lot bigger? Did you turn on doc values?
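One way to answer the document-count and size questions is the `_cat/indices` API, which can list `docs.count` and `store.size` per index. Below is a minimal sketch, assuming Elasticsearch is reachable at `http://localhost:9200` (an assumption, adjust as needed); the helper names are made up for this example.

```python
import urllib.request

# Assumed host and port; h= selects the columns, bytes=b reports sizes in bytes.
CAT_URL = ("http://localhost:9200/_cat/indices/logstash-*"
           "?h=index,docs.count,store.size&bytes=b")


def parse_cat_indices(text):
    """Parse whitespace-separated _cat/indices rows into a sorted list of dicts."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank lines or rows with missing values
        index, docs, size = parts
        rows.append({"index": index, "docs": int(docs), "bytes": int(size)})
    return sorted(rows, key=lambda r: r["index"])


def fetch_index_stats(url=CAT_URL):
    """Fetch the listing from Elasticsearch and parse it."""
    with urllib.request.urlopen(url) as resp:
        return parse_cat_indices(resp.read().decode("utf-8"))


def report(rows):
    """Format one line per index, including the average bytes per document."""
    lines = []
    for r in rows:
        per_doc = r["bytes"] / r["docs"] if r["docs"] else 0.0
        lines.append("{}  docs={}  bytes={}  bytes/doc={:.1f}".format(
            r["index"], r["docs"], r["bytes"], per_doc))
    return lines
```

Running `print("\n".join(report(fetch_index_stats())))` against the cluster should show whether the daily growth comes from more documents or from larger documents.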


(Raghu Eswaraiah) #5

Yes, I am using Elasticsearch 2.0, and yes, my document count is increasing every day.


(Raghu Eswaraiah) #6

With respect to the above query, here is some more information and a few issues.

  1. As per Marvel
    logstash-2015.11.19 data size is 1,004.8MB and Document Count is 2.4m
    logstash-2015.11.20 data size is 263.7MB and Document count is 618.5k

  2. But I am not able to view any data related to logstash-2015.11.20 in Kibana.

  3. I later investigated the log files and found that the last update to the Elasticsearch index happened on 2015-11-19 22:04:09,842 and that the last good contact between Logstash and Elasticsearch was at ~Thu Nov 19 23:00:00 CST 2015.

  4. So it seems that while Elasticsearch is creating the new index folder, Logstash is losing connectivity with Elasticsearch. I observed the same behaviour on 2015.11.17; restarting the Logstash instance resolved the problem but triggered my initial query on this topic.
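As a quick sanity check on the Marvel figures in item 1, the average size per document can be computed for both days. This is only a rough sketch; it assumes Marvel's "MB" means 2**20 bytes and that "m" and "k" mean millions and thousands.

```python
MB = 1024 * 1024  # assumption: Marvel's "MB" means 2**20 bytes

# (data size in bytes, document count) as reported by Marvel in item 1
marvel = {
    "logstash-2015.11.19": (1004.8 * MB, 2.4e6),
    "logstash-2015.11.20": (263.7 * MB, 618.5e3),
}


def bytes_per_doc(size_bytes, doc_count):
    """Average on-disk size of one document in an index."""
    return size_bytes / doc_count


per_doc = {name: bytes_per_doc(*values) for name, values in marvel.items()}

for name, value in sorted(per_doc.items()):
    print("{}: about {:.0f} bytes per document".format(name, value))
```

Both days come out at roughly 440 to 450 bytes per document, which suggests that index size here tracks the document count rather than the documents themselves getting bigger.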

My ELK stack setup flow

Server1 -> Server2 -> Server3
Server1 -> Logstash forwarder
Server2 -> Logstash lumberjack input plugin -> Logstash Kafka output plugin
Server3 -> Logstash kafka input plugin -> grok filter plugin -> Logstash elasticsearch output plugin.

This is the POC I am working on, and I am planning to implement the production setup next week. Any suggestions would be a great help.


(Raghu Eswaraiah) #7

Some more information.

After executing the steps below, data started coming into Kibana.

  1. Restarted the Logstash instance
  2. Restarted the Elasticsearch instance

After both of the above steps, I was still not able to view the data in Kibana.

Then in Kibana I created a new index pattern, logstash-2015.11.20, from Settings -> Indices -> Create.

As I mentioned in the topic "ElasticSearch indexing events from previous date into current date logstash-* folder", I then started seeing the data populating in Kibana.


(Raghu Eswaraiah) #8

Below is a screenshot of the index sizes for 2 days.


(Magnus Bäck) #9

At this point it's not clear to me what the problem is. You're populating ES and it sounds like you can see the data in Kibana.


(Raghu Eswaraiah) #10

The comparison above was in response to one of the questions in this topic. Below is the question that asked for the data.

What is the document count on each index? Did your documents get a lot bigger? Did you turn on doc values?

No, I am not seeing any data in Kibana. I can see that ES indexed the data from Nov-26 to Nov-29, but I can only see the data up to Nov-26.

ES again stopped indexing data from Nov-30.

I am not sure whether I am missing any configuration settings.

