Elasticsearch past indexes

billy6 · January 5, 2016, 8:49am

Hello everybody.
I would like to discuss something I noticed while running some tests.
I had an application sending messages to one machine running a Logstash instance and Redis, and a second machine running a Logstash instance with ES.

In the Logstash instance on the first machine I only forward the data to Redis without any filter, just input{...} and output{...}. When a message reaches the second machine I extract its tags and timestamp, and I write to ES using one index per day.
I had a problem with ES and had to delete the index for one day (December 26th). After I restored everything, I started receiving data in the index for the day I had deleted (December 26th) instead of the current day (January 4th). This seemed strange to me because all the data I received had a December 26th timestamp, and I received 1.82 GB of it! So it behaved as if the second Logstash had a storage queue.

In Logstash there is a thread for each stage (input, filter and output), and they communicate with each other through queues, don't they? How can I control the size of these queues? Is there a parameter to configure it?

magnusbaeck · January 5, 2016, 9:28am

An event is stored in the Elasticsearch index that corresponds to the event's timestamp (the @timestamp field), so it's completely normal and expected that events from Dec 26 are stored in the Dec 26 index even if they're processed on Jan 4.
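As a rough illustration of the point above, here is a small Python sketch of how a daily index name follows from the event's own timestamp rather than from the time of processing. It assumes the commonly used `logstash-%{+YYYY.MM.dd}` index pattern evaluated in UTC. The function name `index_for_event` and the sample timestamps are made up for this example and are not part of Logstash.

```python
from datetime import datetime, timezone


def index_for_event(event_timestamp: datetime, prefix: str = "logstash-") -> str:
    """Return the daily index name derived from the event's @timestamp (in UTC).

    Hypothetical helper: it only mimics a 'prefix + YYYY.MM.dd' index pattern.
    """
    return prefix + event_timestamp.astimezone(timezone.utc).strftime("%Y.%m.%d")


# An event generated on Dec 26 but only processed on Jan 4 (sample values).
event_time = datetime(2015, 12, 26, 14, 30, tzinfo=timezone.utc)
processed_at = datetime(2016, 1, 4, 9, 0, tzinfo=timezone.utc)

print(index_for_event(event_time))    # logstash-2015.12.26
print(index_for_event(processed_at))  # logstash-2016.01.04
```

Only the first value matters for routing: the processing time never enters the index name, so a deleted daily index is simply created again when an event carrying that date arrives.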

Logstash has two internal queues with room for 20 events each. This isn't configurable.
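To give a feel for how little a 20-slot queue can buffer, here is a minimal Python sketch using the standard `queue` module. It is only an analogy under stated assumptions: the size of 20 comes from the post above, the names `QUEUE_SIZE` and `stage_queue` are invented for the example, and a real pipeline would block the producing stage when the queue is full, whereas this sketch counts the events that do not fit.

```python
import queue

# Hypothetical stand-in for one stage-to-stage queue; the size is taken from the post.
QUEUE_SIZE = 20
stage_queue = queue.Queue(maxsize=QUEUE_SIZE)

accepted = 0
rejected = 0
for i in range(100):
    try:
        # put_nowait raises queue.Full instead of blocking, so overflow can be counted.
        stage_queue.put_nowait({"message": f"event {i}"})
        accepted += 1
    except queue.Full:
        rejected += 1

print(accepted, rejected)  # 20 80
```

With nothing consuming from the queue, only 20 of the 100 events fit, which is far from anything that could hold days of data.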

billy6 · January 5, 2016, 9:47am

Hi @magnusbaeck
Thank you very much for your answer.
Yes, but I deleted the December 26th index before restarting the system, so where were the events stored if that index no longer existed?

magnusbaeck · January 5, 2016, 9:57am

So the question isn't really "why did the events end up in the Dec 26 index" but actually "why did Logstash read Dec 26 events at all". It seems your log source still had unprocessed Dec 26 events. Without further clues about where your events came from it's impossible to tell.

What is clear is that Logstash itself has no 1.8 GB internal queue and it has no feature to detect deleted indexes and reprocess the data.

billy6 · January 5, 2016, 10:10am

Thanks @magnusbaeck
The log source is a simple UDP transmitter. Since it's a UDP transmission, it doesn't care whether the destination is available, and I have no queue for this transmission.

I think this could be an Elasticsearch topic...

warkolm · January 5, 2016, 8:40pm

ES simply stores whatever LS tells it to, so I think your problem is not in ES.
