Kibana and reporting not working on big dataset

Mouglou · May 13, 2019, 8:35pm

Hi all!

I have a question about Kibana and its reporting option.

In the "Discover" section, we would like to export the results to a CSV file, so we set a query and 2 filters.
The result is about 586,675 documents according to Kibana. So it's a "small" big dataset, as we have 21B documents in this cluster.

In Kibana, the results are displayed very fast, as usual.

But if we try to export them as CSV, it never works...
Every time, we get the error: Max attempts reached (3)

I already tried setting the timeout to 6 hours, without success. It's only 500,000 rows, so it shouldn't take more than a few minutes with my cluster size... I think!

To give you more information, here is the cluster configuration:

6 Elasticsearch physical data nodes (24 cores and 64G of RAM, with 31G for the Elasticsearch heap)
2 Elasticsearch client nodes: VMs with 4 vCPUs and 6G of RAM.
3 Kibana nodes. Each Kibana is behind its own nginx proxy for SSL.
Version 6.3.0 on all services.

No load, no CPU usage when we try to export... the cluster is mostly idle!!

As we have multiple Kibana instances, I set xpack.reporting.encryptionKey to the same value on each.
Also, as Kibana is behind nginx, I set the various xpack.reporting.kibana.* settings:
https://www.elastic.co/guide/en/kibana/current/reporting-settings-kb.html

Has anybody had the same issue and found how to solve it? I hope so!!

Thanks a lot for your help!
Mouglou

joelgriffith · May 14, 2019, 5:15pm

@mouglou thanks for the question. There are definitely a few things to try here to rule out the usual suspects:

Ensure Elasticsearch's http.max_content_length is large enough to handle the size of the export. Ideally, it and Kibana's xpack.reporting.csv.maxSizeBytes should be set to the same value. In older versions of Kibana, a mismatch here can result in the "Max attempts reached" error.
Ensure your nginx proxy can pass a file of that size through. It can sometimes get in the way (failing the request), which triggers a retry by Kibana.
If neither of those helps, try raising your Kibana log level and see if an error propagates through: logging.verbose: true
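
To keep the two limits in step, it may help to compute the byte count once and reuse it on both sides. The sketch below is only an illustration: `es_size_to_bytes` and `limits_match` are hypothetical helpers, and it assumes that Elasticsearch size strings such as `100mb` or `1gb` use 1024-based units while the Kibana setting is given as a plain number of bytes.

```python
import re

# Elasticsearch-style byte-size suffixes, assumed here to be 1024-based.
_UNITS = {"b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3}


def es_size_to_bytes(size: str) -> int:
    """Convert a size string such as '100mb' or '1gb' into a byte count."""
    match = re.fullmatch(r"\s*(\d+)\s*(b|kb|mb|gb)\s*", size.lower())
    if match is None:
        raise ValueError(f"unsupported size string: {size!r}")
    value, unit = match.groups()
    return int(value) * _UNITS[unit]


def limits_match(es_max_content_length: str, kibana_max_size_bytes: int) -> bool:
    """Return True when both limits describe the same number of bytes."""
    return es_size_to_bytes(es_max_content_length) == kibana_max_size_bytes


if __name__ == "__main__":
    # Value to use for the Kibana side when Elasticsearch is set to 1gb.
    print(es_size_to_bytes("1gb"))
    print(limits_match("1gb", es_size_to_bytes("1gb")))
```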

Hope that helps!

Mouglou · May 15, 2019, 4:10pm

Hi @joelgriffith !

Thanks for your feedback.

I checked, and my logs are bigger than I thought. So yes, I will increase these 2 settings on my Kibana nodes and on my client nodes too (I had already tried, but only the Kibana setting; I didn't know about the Elasticsearch one).

But the error log is not very explicit. "Max attempts reached" is pretty vague! Time, size, proxy... we don't know exactly.

I'm going to check that today and/or tomorrow, and I'll get back to you then!

Thanks again!
Mouglou

Mouglou · May 15, 2019, 8:38pm

@joelgriffith

Just to let you know, after some tests:

I set the limit to 1 GB on both the Elasticsearch client nodes and in the Kibana settings, and set the timeout to 1h.

If I try to export 12h of logs, it takes about 5 seconds. The file is about 139 MB for about 44,100 docs.

Then I tried to export 20h of logs, about 67,500 docs, and here I get a timeout. After 3h, it fails with the "Max attempts reached" error (which makes sense: 3 attempts x 1 hour)...

I tried 16h, about 56,800 docs, and it seems to end in the "Max attempts reached" error again...

So it doesn't seem to come from "time". And not from "size" either, because my logs are pretty much identical, so the file should be a little bigger, but not enough to reach the 1 GB limit...

And I don't find any errors in nginx. So I have no solution for now.

Also, my goal is to export 20 days of specific logs, which would be about 500,000 docs according to Kibana.
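
As a rough sanity check on the "size" question, the 12h export (139 MB for 44,100 docs) can be scaled to the other document counts mentioned in this post. This is a back-of-the-envelope sketch, assuming the CSV size grows linearly with the number of docs (plausible if the logs are nearly identical); `estimate_csv_mb` is a hypothetical helper.

```python
def estimate_csv_mb(docs: int, sample_docs: int = 44_100, sample_mb: float = 139.0) -> float:
    """Scale the observed 12h export (139 MB for 44,100 docs) linearly."""
    return docs * sample_mb / sample_docs


if __name__ == "__main__":
    # Document counts taken from the exports described in this post.
    for label, docs in [("16h", 56_800), ("20h", 67_500), ("20 days", 500_000)]:
        print(f"{label}: ~{estimate_csv_mb(docs):.0f} MB")
```

Under that assumption, the 16h and 20h exports come out at roughly 180 MB and 213 MB, well under a 1 GB limit, while 500,000 docs would be around 1.6 GB and so above it.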

I'm going to set Kibana to verbose mode, but I really don't know where it can come from...
Any ideas?

I'm pretty sure somebody has made reports bigger than ours!!
Mouglou

Mouglou · May 22, 2019, 2:47pm

Hi @joelgriffith

To let you know, the only solution we found to keep the job working is exporting in batches of 12h...
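
For anyone scripting the same workaround, here is a minimal sketch of how a long range can be cut into 12h windows, one per export. `time_windows` is a hypothetical helper and the dates are made up for illustration; it only computes the ranges and does not call Kibana or Elasticsearch.

```python
from datetime import datetime, timedelta
from typing import Iterator, Tuple


def time_windows(
    start: datetime,
    end: datetime,
    step: timedelta = timedelta(hours=12),
) -> Iterator[Tuple[datetime, datetime]]:
    """Yield consecutive [window_start, window_end) ranges covering start..end."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    cursor = start
    while cursor < end:
        # The last window is clipped so it never runs past `end`.
        window_end = min(cursor + step, end)
        yield cursor, window_end
        cursor = window_end


if __name__ == "__main__":
    # Illustrative 20-day range; each printed line is one export to run.
    for begin, finish in time_windows(datetime(2019, 5, 1), datetime(2019, 5, 21)):
        print(begin.isoformat(), "->", finish.isoformat())
```

A 20-day range gives 40 windows of 12h, so 40 separate CSV files to generate and concatenate.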

Also, when we try to export more, it is the JVM heap of the Elasticsearch client node that fails, and it takes the client node down...

So I will move the client from 2G to 4G of heap memory. But this is only a workaround; I think reporting should be able to export more data without crashing my Elasticsearch client node.

Do you have any best practices, or is there something I need to know about the client nodes to be able to run the query? Or maybe something in the configuration?

Thanks for your help!

Mouglou · May 22, 2019, 3:15pm

As we can see, during the CSV export query, one of my 2 client nodes shows a spike in heap usage.

But it now lets me run the same export query over 3 days! So that's a good thing, but if I understand correctly, the result of the query is held in memory.

If I want a month of data, I can't get a node with 128G of heap! (The maximum is 32G for the JVM to keep good performance.)
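
To illustrate the difference between holding a whole result in memory and writing it out page by page, here is a generic sketch. It is not how Kibana reporting works internally; `fetch_page` is a hypothetical callable standing in for any paged query (for example an Elasticsearch scroll), and the fake pages in the demo are made up.

```python
import csv
import io
from typing import Callable, Dict, List, Optional, TextIO, Tuple

# A page fetcher takes a cursor (None for the first page) and returns
# (rows, next_cursor); next_cursor is None when there are no more pages.
Page = Tuple[List[Dict[str, str]], Optional[str]]


def stream_to_csv(
    fetch_page: Callable[[Optional[str]], Page],
    fields: List[str],
    out: TextIO,
) -> int:
    """Write rows page by page so only one page is held in memory at a time."""
    writer = csv.DictWriter(out, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    written = 0
    cursor: Optional[str] = None
    while True:
        rows, cursor = fetch_page(cursor)
        writer.writerows(rows)
        written += len(rows)
        if cursor is None:
            return written


if __name__ == "__main__":
    # Two fake pages keyed by cursor, standing in for a real paged query.
    fake_pages = {
        None: ([{"msg": "first"}, {"msg": "second"}], "page-2"),
        "page-2": ([{"msg": "third"}], None),
    }
    buffer = io.StringIO()
    print(stream_to_csv(lambda cursor: fake_pages[cursor], ["msg"], buffer))
    print(buffer.getvalue())
```

With this shape, peak memory depends on the page size rather than on the total number of rows exported.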

Any explanations? And are there configuration optimizations to allow reporting on big datasets?

Mouglou

system · June 19, 2019, 3:15pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
