Regarding Persistent Queues


#1

Hi,
Is there a recommendation for how much disk space should be allocated to persistent queues? Or is it simply a case of the more the better?

Regards,
D


(Magnus Kessler) #2

The role of persistent queues is to provide a buffer for situations where Elasticsearch cannot ingest data at the rate it arrives.

If Elasticsearch is not available at all (the worst case), the buffer should be big enough to hold all the data that would arrive during the downtime. In other words, decide how long you can afford for Elasticsearch to be unavailable before you start losing new data.
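As a rough illustration of that sizing rule, here is a minimal sketch in Python. The event rate, average event size, and padding factor are assumptions you would need to measure or choose for your own pipeline; the size of an event as stored in the queue may differ from its size on the wire, which is why the padding factor is there. The result is the kind of figure you would compare against the `queue.max_bytes` setting.

```python
import math


def queue_bytes_for_outage(events_per_second, avg_event_bytes,
                           outage_seconds, padding_factor=1.0):
    """Estimate the queue capacity needed to ride out a full outage.

    All inputs are assumptions about your own workload:
    - events_per_second: average arrival rate during the outage
    - avg_event_bytes: average size of one event as stored in the queue
    - outage_seconds: how long Elasticsearch may be unavailable
    - padding_factor: safety margin (1.0 means no margin)
    """
    raw_bytes = events_per_second * avg_event_bytes * outage_seconds
    return math.ceil(raw_bytes * padding_factor)


if __name__ == "__main__":
    # Hypothetical workload: 1000 events/s, 500 bytes each, 4 hour outage,
    # with a 50% safety margin.
    needed = queue_bytes_for_outage(1000, 500, 4 * 3600, padding_factor=1.5)
    print(f"{needed / 1024 ** 3:.1f} GiB")
```

The numbers in the example are made up; the point is only that the required size scales linearly with arrival rate, event size, and the outage you want to survive.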

The other scenario is that, for a short period, more data arrives than can be ingested immediately. This could happen as a result of a logging spike. Again, decide how long this condition should be allowed to last before the queue fills up and data is no longer forwarded.
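For the spike scenario, the queue only has to absorb the surplus, that is, the difference between the arrival rate and the rate Elasticsearch can ingest. A minimal sketch in Python, assuming constant rates during and after the spike (real traffic will be burstier) and with all rates and sizes being hypothetical inputs:

```python
def spike_backlog_bytes(spike_eps, ingest_eps, avg_event_bytes, spike_seconds):
    """Bytes that accumulate in the queue while arrivals exceed ingestion."""
    surplus_eps = max(0, spike_eps - ingest_eps)
    return surplus_eps * avg_event_bytes * spike_seconds


def drain_seconds(backlog_bytes, normal_eps, ingest_eps, avg_event_bytes):
    """Time to empty the backlog once arrivals fall back to the normal rate.

    Returns infinity if there is no spare ingest capacity, in which case
    the queue never drains and more resources are needed.
    """
    headroom_eps = ingest_eps - normal_eps
    if headroom_eps <= 0:
        return float("inf")
    return backlog_bytes / (headroom_eps * avg_event_bytes)


if __name__ == "__main__":
    # Hypothetical numbers: a 10 minute spike of 5000 events/s against an
    # ingest capacity of 3000 events/s, with 500 byte events.
    backlog = spike_backlog_bytes(5000, 3000, 500, 600)
    print(backlog, "bytes queued")
    print(drain_seconds(backlog, 1000, 3000, 500), "seconds to drain")
```

If the drain time comes out as infinite or very long, the load is not really a spike, and the queue will keep growing no matter how much disk it is given.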

Typically one would configure the persistent queues to hold several hours' or a day's worth of data. During this period, the operations team should attempt to fix the root cause of the high log volume. Load that continually causes persistent queues to build up over longer periods should be addressed by providing additional Logstash or Elasticsearch resources.