Kibana reporting not working on a big dataset

Hi all!

I have a question about Kibana and its reporting option.

In the "Discover" section, we would like to export the results to a CSV file, so we set a query and 2 filters.
The result is about 586,675 documents according to Kibana. So it's a "small" big dataset, as we have 21 billion documents in this cluster.

In Kibana, the results are displayed very quickly, as usual.

But if we try to export them as CSV, it never works...
Every time, we get the error: Max attempts reached (3)

I already tried setting the timeout to 6 hours, without success. It's only about 500,000 rows, so it shouldn't take more than a few minutes with my cluster size... I think!

To give you more information, here is the cluster configuration:

  • 6 Elasticsearch data nodes on physical hardware (24 cores and 64 GB of RAM, with a 31 GB heap for Elasticsearch)
  • 2 Elasticsearch client nodes: VMs with 4 vCPUs and 6 GB of RAM.
  • 3 Kibana nodes. Each Kibana is behind its own nginx proxy for SSL.
  • Version 6.3.0 on all services.

There is no load and no CPU usage when we try to export... the cluster is mostly idle!

As we have multiple Kibana instances, I set xpack.reporting.encryptionKey to the same value on each.
Also, as Kibana is behind nginx, I set the various xpack.reporting.kibana.* settings:
https://www.elastic.co/guide/en/kibana/current/reporting-settings-kb.html

Has anybody had the same issue and found out how to solve it? I hope so!

Thanks a lot for your help!
Mouglou

@mouglou thanks for the question. There are definitely a few things to try here to rule out all the usual stuff:

  • Ensure Elasticsearch's http.max_content_length is set high enough to handle the size. Ideally it and Kibana's xpack.reporting.csv.maxSizeBytes are the same value. In older versions of Kibana, a mismatch here can result in the "Max attempts reached" error.

  • Ensure your nginx proxy can pass a file of that size through. It can sometimes get in the way (failing the request), which triggers a retry by Kibana.

  • If neither of those helps, try raising your Kibana log level to see whether an error propagates through: logging.verbose: true
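
To make the first point concrete, here is a minimal Python sketch for checking that the two limits agree. It assumes http.max_content_length is written as a size string such as 1gb using binary multiples, and that xpack.reporting.csv.maxSizeBytes is a plain byte count. The helper names to_bytes and limits_match are hypothetical, not part of Kibana or Elasticsearch.

```python
import re

# Binary multiples, assumed to match how Elasticsearch reads size strings.
_UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3}


def to_bytes(size: str) -> int:
    """Convert a size string such as '100mb' or '1gb' to a byte count."""
    match = re.fullmatch(r"\s*(\d+)\s*(b|kb|mb|gb)\s*", size.lower())
    if match is None:
        raise ValueError(f"unrecognized size: {size!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


def limits_match(es_max_content_length: str, kibana_max_size_bytes: int) -> bool:
    """True when the Elasticsearch limit and the Kibana CSV limit are equal."""
    return to_bytes(es_max_content_length) == kibana_max_size_bytes


# Value to put in xpack.reporting.csv.maxSizeBytes for a 1gb Elasticsearch limit.
print(to_bytes("1gb"))
```

The printed number is the byte count that would pair with a 1gb limit on the Elasticsearch side.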

Hope that helps!

Hi @joelgriffith!

Thanks for your feedback.

I checked, and my logs are bigger than I thought. So yes, I will increase these 2 settings on my Kibana nodes and my client nodes too (I had already tried, but only the Kibana setting; I didn't know about the Elasticsearch setting).

But the error log is not very explicit. "Max attempts reached" is pretty vague! Time, size, proxy... we don't know exactly which.

I'm going to check that today and/or tomorrow, and I'll get back to you then!

Thanks again!
Mouglou

@joelgriffith

Just to let you know, after some tests:

I set the size limit to 1 GB on both the Elasticsearch client nodes and in the Kibana settings, and the timeout to 1 hour.

If I export 12 hours of logs, it takes about 5 seconds, and the file is about 139 MB for about 44,100 docs.

Then I tried to export 20 hours of logs, about 67,500 docs, and here I get a timeout. After 3 hours, it fails with the max attempts error (which makes sense: 3 attempts x 1 hour)...

I also tried 16 hours, about 56,800 docs, and it seems to be heading for the max attempts error again...

So it doesn't seem to come from "time". And not from "size" either, because my logs are pretty much identical, so the file should be a little bigger, but not enough to reach the 1 GB limit...
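
As a rough check of that size reasoning, here is a back-of-the-envelope estimate in Python. It assumes the CSV size scales linearly with the document count (based on the 139 MB / 44,100 docs sample) and that the 1 GB limit counts as 1024 MB; both are assumptions, not measurements.

```python
# Figures from the 12-hour export reported above.
SAMPLE_DOCS = 44_100
SAMPLE_SIZE_MB = 139.0
LIMIT_MB = 1024.0  # assumption: the 1 GB limit counted as 1024 MB


def estimated_size_mb(doc_count: int) -> float:
    """Linear extrapolation of the CSV size from the 12-hour sample."""
    return SAMPLE_SIZE_MB * doc_count / SAMPLE_DOCS


for docs in (56_800, 67_500, 500_000):
    size = estimated_size_mb(docs)
    verdict = "over" if size > LIMIT_MB else "under"
    print(f"{docs:>7} docs ~ {size:7.1f} MB ({verdict} the limit)")
```

Under these assumptions the 16-hour and 20-hour exports stay well under the limit (roughly 180 and 213 MB), while the 500,000-document target would land around 1.5 GB, above it.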

And I don't find any errors on nginx. So I have no solution for now.

Also, my goal is to export 20 days of specific logs, which would be about 500,000 docs according to Kibana.

I'm going to set Kibana to verbose mode, but I really don't know where it can come from...
Any ideas?

I'm pretty sure somebody has generated a bigger report than ours!! :wink:
Mouglou

Hi @joelgriffith

Just to let you know, the only way we found to keep the job working is to export in batches of 12 hours...
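
The 12-hour batching can be sketched in Python as a generator of time windows, one per export. The function name batch_windows and the dates are hypothetical; only the 12-hour step and the 20-day range come from this thread.

```python
from datetime import datetime, timedelta


def batch_windows(start, end, hours=12):
    """Yield (window_start, window_end) pairs covering [start, end) in fixed steps."""
    step = timedelta(hours=hours)
    cursor = start
    while cursor < end:
        window_end = min(cursor + step, end)  # the last window may be shorter
        yield cursor, window_end
        cursor = window_end


# Hypothetical 20-day range; 20 days in 12-hour steps gives 40 windows.
windows = list(batch_windows(datetime(2018, 7, 1), datetime(2018, 7, 21)))
for window_start, window_end in windows[:3]:
    print(window_start.isoformat(), "->", window_end.isoformat())
```

Each pair would then be used as the time range of one export, and the resulting CSV files concatenated afterwards.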

Also, when we try to export more, it is the JVM heap of the Elasticsearch client node that blows up and takes the client node down...

So I will move the client nodes from 2 GB to 4 GB of heap. But this is only a workaround; I think reporting should be able to export more data without crashing my Elasticsearch client node.

Do you have any best practices, or anything I need to know about the client nodes to be able to run the query? Or maybe about the configuration too?

Thanks for your help!

As we can see, during the CSV export query, one of my 2 client nodes shows a peak in heap usage.

But with the bigger heap I can now run the same export query over 3 days! So that's a good thing, but if I understand correctly, it stores the result of the query in memory.

If I want a month of data, I can't get a node with 128 GB of heap! (As the maximum is about 32 GB for the JVM heap to keep performance.)
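
For contrast with holding a whole result set in memory, here is a generic Python sketch of paging through results and writing each page to CSV as it arrives, so only one page is held at a time. This is not a description of how Kibana reporting works, which this thread does not establish. fetch_page is a hypothetical callable standing in for a cursor-based search (for example Elasticsearch's scroll API), and the fake data is only there to make the sketch runnable.

```python
import csv
import io
from typing import Callable, Dict, Iterator, List, Optional, Tuple

Row = Dict[str, str]
Page = Tuple[List[Row], Optional[str]]  # (rows, next cursor or None)


def iter_pages(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[List[Row]]:
    """Yield one page of rows at a time until no cursor is returned."""
    cursor = None
    while True:
        rows, cursor = fetch_page(cursor)
        if rows:
            yield rows
        if cursor is None:
            break


def stream_to_csv(fetch_page, fields, out) -> int:
    """Write pages to `out` as they arrive; return the number of rows written."""
    writer = csv.DictWriter(out, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    written = 0
    for rows in iter_pages(fetch_page):
        writer.writerows(rows)  # the page can be discarded after this call
        written += len(rows)
    return written


# Hypothetical in-memory stand-in for a real paged search.
_FAKE_PAGES = {
    None: ([{"host": "a", "msg": "x"}, {"host": "b", "msg": "y"}], "p2"),
    "p2": ([{"host": "c", "msg": "z"}], None),
}


def fake_fetch(cursor):
    return _FAKE_PAGES[cursor]


buffer = io.StringIO()
count = stream_to_csv(fake_fetch, ["host", "msg"], buffer)
print(count, "rows written")
```

This only bounds memory on the exporting side; whether it would relieve the client node's heap is not something this thread answers.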

Any explanation? And any configuration optimizations to allow reporting on big datasets?

Mouglou
