How Can I Identify LARGE Documents/Logs

We have a 2-master, 5-slave Elasticsearch cluster collecting logs from a ton of different microservice servers. Although indexing has never been a problem, occasionally our Kibana goes down due to extremely long timeouts. Sometimes I have been able to track these problems back to EXTREMELY large individual documents ruining query times. Typically these have been the result of a faulty multiline filter.

However, here is my problem - sometimes when we get these timeout issues, I don't know how to identify which server is producing the massive logs because we have so many. Since most of our logs go into the same daily index, is there any way to identify, based on source (we have a "source" field in our logs) or something else, which server is producing the problem logs that are freezing our queries?

Any help on this would be massively appreciated!!
Thanks!

You might need to install the mapper-size plugin, which adds a _size field holding the size of each document's original source in bytes - https://www.elastic.co/guide/en/elasticsearch/plugins/current/mapper-size.html
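To make that concrete, here is a rough sketch in Python of the request bodies you could send once the plugin is installed and `_size` is enabled on the index. It only builds the JSON; it does not talk to a cluster. The helper names (`size_mapping`, `largest_docs_query`) are made up for illustration, and it assumes your "source" field is indexed as a single exact value (keyword / not analyzed) so a terms aggregation on it behaves. The exact mapping shape depends on your Elasticsearch version: on versions that still use mapping types, the `_size` block sits under the type name. Also note that only documents indexed after `_size` is enabled will carry a size.

```python
import json


def size_mapping(doc_type=None):
    """Mapping fragment that enables the _size field (needs the mapper-size plugin).

    doc_type is only needed on Elasticsearch versions that still use mapping
    types; which shape applies to your cluster is an assumption to verify.
    """
    size_block = {"_size": {"enabled": True}}
    if doc_type is None:
        return {"mappings": size_block}
    return {"mappings": {doc_type: size_block}}


def largest_docs_query(top_n=10, source_field="source"):
    """Search body: biggest documents first, plus the largest document per source.

    source_field defaults to the "source" field mentioned in the question.
    """
    return {
        "size": top_n,
        # Return only the source field; the sort value of each hit is its _size.
        "_source": [source_field],
        "sort": [{"_size": {"order": "desc"}}],
        "aggs": {
            "by_source": {
                # Rank sources by the size of their largest document.
                "terms": {
                    "field": source_field,
                    "size": top_n,
                    "order": {"max_size": "desc"},
                },
                "aggs": {"max_size": {"max": {"field": "_size"}}},
            }
        },
    }


if __name__ == "__main__":
    # Paste the printed JSON into a _search request against the daily index.
    print(json.dumps(largest_docs_query(5), indent=2))
```

The `by_source` buckets should point at the servers producing the oversized documents, and the top hits give you the individual offenders to inspect.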

Also, there is no such thing as a slave in ES :)

That's a fantastic plugin, thank you. Hopefully I can put it to good use. Any other ideas are still appreciated in the meantime!

Am I referring to the non-master nodes incorrectly?

Yeah, there is a single active master and multiple master-eligible nodes.

Oh, I had been referring to the 5 data-only, non-master-eligible nodes as slaves. We have 1 master and 1 other master-eligible node.

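That's not good. With only two master-eligible nodes you can't pick a quorum that both prevents split brain and survives losing one of them; three master-eligible nodes is the usual minimum. Read https://www.elastic.co/guide/en/elasticsearch/guide/master/important-configuration-changes.html#_minimum_master_nodes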
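As a quick illustration of why two master-eligible nodes is awkward, here is the quorum arithmetic from that page, (master-eligible nodes / 2) + 1, as a small Python sketch. The function names are made up for this example; the result is the value you would put in `discovery.zen.minimum_master_nodes` on versions that still use that setting.

```python
def quorum(master_eligible):
    """Majority of master-eligible nodes: (n / 2) + 1 using integer division."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def tolerable_failures(master_eligible):
    """How many master-eligible nodes can be lost while a quorum remains."""
    return master_eligible - quorum(master_eligible)


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(n, "master-eligible ->", "quorum", quorum(n),
              "| can lose", tolerable_failures(n))
```

With 2 master-eligible nodes the quorum is 2, so losing either one leaves the cluster without a master; with 3 the quorum is still 2, so one can fail safely.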