Not able to download all rows of data in CSV

Hi,

I have a query that generated 25 hits, and I have selected a few columns to display in tabular format. When I try to export the results by generating a CSV report, I'm only able to export 15 rows, not all 25. My ES cluster consists of 3 servers.

If I log in to the same Kibana from another location and run the same query with the same 25 results, I'm only able to generate a CSV containing 6 rows this time.

Why am I not able to export all 25 rows to CSV? The results are small, less than 1 MB in total.

Thank you.

Hi @hjazz6

Is it a Discover saved search/table, or a Lens/visualization type of table?

It is from Discover. It is not saved.

Can I ask how many columns there are? Are there any free-text columns?

There are 11 columns, and there are free-text columns. One of the warnings I see is that there are characters which spreadsheet applications could interpret as formulas. However, such rows are exported correctly.

Yes, those are just warnings about sanitisation taking place to avoid security issues.

Can you see any logs on the server side about CSV reporting?

I'm not sure if this is related, but I was hitting the high disk watermark (90%) and also getting the message "unexpected error while indexing monitoring document". I've deleted some indices, and I have not seen these messages since. The cluster health is also green now.

But I've tried exporting the search results again, and I'm now only getting 4 out of 24 results. I don't see any messages about CSV reporting on the ES servers.

@tsullivan can you help with this?

Hi,

  • Which version of Kibana is this?
  • Are you using frozen data tiers or cross-cluster search?

CSV reports are a Kibana feature, so let's try looking for logs on the Kibana server. It would help to enable debug logging; a minimal example follows.
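
A minimal kibana.yml sketch that turns on debug logging for just the Reporting plugin (restart Kibana after changing the file):

logging:
  loggers:
    - name: plugins.reporting
      level: debug

Then re-run the export and watch the Kibana logs for messages from that logger.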

I'm using 8.3.3. This is a cross-cluster search, across 3 ES nodes.

Thanks for the suggestion to enable debug logging in Kibana. I have done so, and there is something I noticed.

The results I got in the Discover app span multiple indices. Out of the 12 results I have, 7 were exported to the CSV (I'm searching over the last 7 days, so the number of results keeps changing from day to day). I noticed that the 5 results that were not exported came from indices xxx-1751, xxx-1767, xxx-1774, xxx-1784, and xxx-1825.

In the Kibana log, under log.logger: "plugins.reporting.runTask.csv_searchsource.execute-job:xxx", the "Results metadata:" message shows:

"_shards": {
   "failed": 940,
   "failures": [
      {
         "index": "xxx-001737",
         "node": "xxx",
         "reason": {
            "reason": "Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting",
            "type": "exception"
         },
         "shard": 0
      },
      ...
   ],
   "skipped": 0,
   "successful": 1500,
   "total": 2440
}

The "Trying to create too many scroll contexts" failure occurred for indices 1737 to 1851, which includes the 5 indices whose results were not exported to CSV. The other 7 exported results were in indices outside this range. Could this be why those 5 results were missing from the CSV?

Hi, thank you for including the error message content. If I understand this message returned from Elasticsearch, it looks like your export query spans 2,440 shards, and the search failed on 940 of them.

In 8.3, CSV export uses the scroll API in Elasticsearch. There are a few key limitations of this API:

  • Elasticsearch creates a scroll context for each shard that spans the data for the query.
  • Elasticsearch limits the number of open scroll contexts to 500 by default (the search.max_open_scroll_context setting); you can check current usage with the nodes stats API, as sketched below.
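
To check how close you are to that limit, one option (a sketch; field names can vary slightly by version) is the nodes stats API, which reports open search and scroll contexts per node:

GET _nodes/stats/indices/search

In the response, look at indices.search.open_contexts and indices.search.scroll_current on each node while an export is running.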

Starting in v8.6, CSV export uses the point-in-time (PIT) API to collect all the data for a report. This API doesn't use the limited scroll contexts (a sketch of the PIT flow follows the list below).

Note that the point-in-time export comes with its own set of limitations:

  • The security privileges of the reporting user must be granted on the indices directly, including backing indices; having permission on only the alias will not work.
  • If any shards are unavailable at the time of the search, the export will result in an empty CSV file.
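
For context, here is roughly what the PIT flow looks like if driven by hand (a sketch; the index pattern is a placeholder). First open a point in time over the indices:

POST /xxx-*/_pit?keep_alive=1m

Then page through the hits, passing the returned id back on each call:

POST /_search
{
  "size": 500,
  "pit": { "id": "<id returned by the _pit call>", "keep_alive": "1m" },
  "sort": [ { "@timestamp": "asc" } ]
}

Subsequent pages repeat the search with a search_after array copied from the sort values of the last hit of the previous page. Because a single PIT is used for the whole export, the 500 scroll-context limit does not come into play.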

Another thing: since you are using cross-cluster search, on any version of the stack you may run into internal timeout issues, because Elasticsearch may take longer than 30 seconds to return the data to the Kibana Reporting plugin. I recommend increasing the xpack.reporting.csv.scroll.duration setting above its 30-second default. I have seen cases in the field where this value had to be set even higher than 30 minutes. New documentation about this is actively being worked on.
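
For example, in kibana.yml (a sketch; the value is illustrative and may need to be much higher for your cluster):

xpack.reporting.csv.scroll.duration: 10m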

If I have about 2K+ shards at any one time and my query covers the entire index, should I raise the scroll context limit via the search.max_open_scroll_context setting to, say, 3000?

Thanks for this suggestion. I will increase this value.
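
For reference, this is the dynamic cluster setting I plan to change (a sketch; 3000 is just the value from my question):

PUT _cluster/settings
{
  "persistent": {
    "search.max_open_scroll_context": 3000
  }
}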

I recommend not covering the entire index in a single export, and sticking to smaller selections that work with the Kibana defaults.

Kibana has limits in place so the product can deliver the most value to the most consumers in a simple, user-friendly way. When the use case demands working outside of those limits, it may be better to use the Elasticsearch APIs directly in a custom script or application.

It might interest you to see if the SQL REST API could work for your use case, as it offers CSV as an output format: Response Data Formats | Elasticsearch Guide [8.11] | Elastic
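
A minimal sketch of that approach (index pattern and field names are placeholders):

POST /_sql?format=csv
{
  "query": "SELECT \"@timestamp\", message FROM \"xxx-*\" ORDER BY \"@timestamp\" DESC LIMIT 100"
}

The response body is plain CSV, so a script can write it straight to a file. For larger result sets, the non-JSON formats return a cursor in the Cursor HTTP header, which you pass back in a follow-up request to fetch the next page.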
