Unable to generate CSV

Hi,

I'm running into an issue when generating reports for large amounts of data to export as CSV. I wanted to confirm whether I'm hitting some kind of maximum size or whether this is a configuration issue. Currently I am trying to pull a Twitter users database for graph data analysis, but I can only generate about 20 MB. The full dataset will be large, so I'm wondering if I'm hitting a hard cap on how large the export can be. Even after increasing the timeout in kibana.yml, the CSV max size, and http.max_content_length in elasticsearch.yml (a rough sketch of those settings is included after the log below), I still cannot generate a report. Below is the error I am seeing in our kibana.log file:

{"type":"log","@timestamp":"2018-01-29T19:37:16Z","tags":["reporting","esqueue","worker","debug"],"pid":18707,"message":"jd0lg6ox0efn7d14ac7atpm2 - Failure occurred on job jd0m5q9p0efn7d14ac3iy2o0: Error: "toString()" failed\n at Buffer.toString (buffer.js:495:11)\n at MaxSizeStringBuilder.getString (/usr/share/kibana/plugins/x-pack/plugins/reporting/export_types/csv/server/lib/max_size_string_builder.js:20:46)\n at /usr/share/kibana/plugins/x-pack/plugins/reporting/export_types/csv/server/lib/generate_csv.js:51:24\n at next (native)\n at step (/usr/share/kibana/plugins/x-pack/plugins/reporting/export_types/csv/server/lib/generate_csv.js:20:191)\n at /usr/share/kibana/plugins/x-pack/plugins/reporting/export_types/csv/server/lib/generate_csv.js:20:361"}
{"type":"log","@timestamp":"2018-01-29T19:37:16Z","tags":["reporting","worker","debug"],"pid":18707,"message":"CSV: Worker error: (jd0m5q9p0efn7d14ac3iy2o0)"}
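For reference, the settings I adjusted look something like this (this is only a sketch; the exact names I touched and the values shown here are illustrative, not my real config):

# kibana.yml
xpack.reporting.queue.timeout: 300000        # reporting job timeout in ms (default 120000)
xpack.reporting.csv.maxSizeBytes: 209715200  # ~200 MB (default 10485760, i.e. 10 MB)

# elasticsearch.yml
http.max_content_length: 500mb               # default 100mb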

Is there a way to dump large datasets (~2 GB)?

Thank you!

Did you try the xpack.reporting.csv.maxSizeBytes setting in kibana.yml?

https://www.elastic.co/guide/en/kibana/current/reporting-settings-kb.html
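For example, in kibana.yml (the value below is just an example; the default is 10485760 bytes, i.e. 10 MB):

xpack.reporting.csv.maxSizeBytes: 52428800   # ~50 MB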

It's not recommended to set xpack.reporting.csv.maxSizeBytes to something as large as 2 GB. Every CSV export is stored as a single document in Elasticsearch so it can be downloaded later, and storing a 2 GB document isn't something you want to do. If you set the limit that high, the export will likely error out anyway, because Elasticsearch enforces a maximum request size; and even if you raised that setting as well, indexing a 2 GB document would consume a large amount of heap.
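To illustrate how the two limits interact (the numbers below are examples, not recommendations): even if kibana.yml allowed a ~2 GB CSV, the single document holding it would still have to fit within Elasticsearch's request size limit, and indexing it would be heavy on heap:

# kibana.yml
xpack.reporting.csv.maxSizeBytes: 2147483648   # ~2 GB -- not recommended

# elasticsearch.yml
http.max_content_length: 100mb                 # default; a ~2 GB request would be rejected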

So what is the solution for large datasets?

@Diego_Auza The proper solution is tracked in this GitHub issue. If you'd like to give it a +1 or comment with your use case, that will help us prioritize it appropriately.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.