How Can I Identify LARGE Documents/Logs

Maxwell_Flanders · April 13, 2016, 9:50pm

We have a 2-master, 5-slave Elasticsearch cluster collecting logs from a ton of different microservice servers. Although indexing has never been a problem, our Kibana occasionally goes down due to extremely long timeouts. Sometimes I have been able to track these problems back to EXTREMELY large individual documents ruining query times. Typically these have been the result of a faulty multiline filter.

However, here is my problem: sometimes when we get these timeout issues, I don't know how to identify which server is producing the massive logs, because we have so many. Since most of our logs go into the same daily index, is there any way to identify, based on source (we have a "source" field in our logs) or something else, which server is producing the problem logs that are freezing our queries?

Any help on this would be massively appreciated!!
Thanks!

warkolm · April 13, 2016, 11:22pm

You might need to install the mapper-size plugin, which adds a _size field you can use to grab the document size - https://www.elastic.co/guide/en/elasticsearch/plugins/current/mapper-size.html
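As a rough illustration of how the mapper-size plugin linked above could answer the original question, here is a minimal Python sketch that only builds the JSON bodies and does not talk to a cluster. It assumes the plugin is installed, that _size is enabled in the mapping before the documents are indexed (documents indexed earlier would not carry the field), and that the "source" field is indexed so that a terms aggregation groups on whole values. The function names, the 1 MB threshold, and the aggregation names are made up for the example, and where the mapping fragment goes depends on your Elasticsearch version.

```python
import json


def size_mapping_fragment():
    """Mapping fragment that turns on the _size field (mapper-size plugin)."""
    return {"_size": {"enabled": True}}


def large_docs_by_source_query(min_bytes, source_field="source", top_n=10):
    """Search body: which sources emit documents larger than min_bytes?

    Filters on _size, then buckets by the (assumed) source field and
    orders the buckets by the largest document seen in each.
    """
    return {
        "size": 0,  # only the aggregation is needed, not the hits
        "query": {"range": {"_size": {"gt": min_bytes}}},
        "aggs": {
            "by_source": {
                "terms": {
                    "field": source_field,
                    "size": top_n,
                    "order": {"max_bytes": "desc"},
                },
                "aggs": {"max_bytes": {"max": {"field": "_size"}}},
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical threshold: anything over roughly 1 MB of _source.
    print(json.dumps(size_mapping_fragment(), indent=2))
    print(json.dumps(large_docs_by_source_query(1000000), indent=2))
```

The second body can be sent to the _search endpoint of the daily index. The buckets that come back list the source values with oversized documents, largest first, which should point at the server with the faulty multiline filter.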

Also, there is no such thing as a slave in ES.

Maxwell_Flanders · April 14, 2016, 1:30am

That's a fantastic plugin, thank you. Hopefully I can put it to good use. Any other ideas are still appreciated in the meantime!

Am I referring to the non-master nodes incorrectly?

warkolm · April 14, 2016, 3:54am

Yeah, there is a single active master and multiple master-eligible nodes.

Maxwell_Flanders · April 14, 2016, 2:53pm

Oh, I had been referring to the 5 data-only, non-master-eligible nodes as slaves. We have 1 master and 1 master-eligible node.

warkolm · April 14, 2016, 11:01pm

That's not good. Read "Important Configuration Changes" in Elasticsearch: The Definitive Guide.
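The reply does not say why two master-eligible nodes is a problem. A likely reading, assuming it refers to the minimum_master_nodes section of that guide, is the quorum rule (master-eligible nodes / 2) + 1. A small Python sketch of the arithmetic follows; the function names are made up for the example.

```python
def minimum_master_nodes(master_eligible):
    """Quorum of master-eligible nodes: (n / 2) + 1, using integer division."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def tolerated_master_failures(master_eligible):
    """How many master-eligible nodes can be lost while a quorum remains."""
    return master_eligible - minimum_master_nodes(master_eligible)


if __name__ == "__main__":
    for n in (2, 3):
        print("%d master-eligible: quorum=%d, can lose %d"
              % (n, minimum_master_nodes(n), tolerated_master_failures(n)))
    # prints:
    # 2 master-eligible: quorum=2, can lose 0
    # 3 master-eligible: quorum=2, can lose 1
```

With two master-eligible nodes the quorum is 2, so losing either node leaves the cluster without a master, while setting the quorum to 1 instead would allow a split brain. Three master-eligible nodes is the smallest layout that survives the loss of one.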