Not able to download all rows of data in CSV

Hi,

I have a query that generated 25 hits, and I have selected a few columns to display in tabular format. When I try to export the results by generating a CSV report, I'm only able to export 15 rows, not all 25. My ES cluster consists of 3 servers.

If I log in to the same Kibana from another location and run the same query with the same 25 results, I'm only able to generate a CSV containing 6 rows this time.

Why am I not able to export all 25 rows to CSV? The results are small, less than 1 MB in total.

Thank you.

Hi @hjazz6

Is it a Discover saved search/table, or a Lens/visualization type of table?

It is from Discover. It is not saved.

Can I ask how many columns there are? Are there any free-text columns?

There are 11 columns, and there are free-text columns. One of the warnings I see is that there are characters which spreadsheet applications could interpret as formulas. However, such rows are exported correctly.

Yes, those are just warnings about sanitisation taking place to avoid security issues.

Can you see any logs on the server side about CSV reporting?

I'm not sure if this is related, but I was hitting the high disk watermark (90%) and also getting the message "unexpected error while indexing monitoring document". I've deleted some indices, and I have not seen these messages since. The cluster health is also green now.

But I've tried exporting the search results again, and I'm now only getting 4 out of 24 results. I don't see any messages about CSV reporting on the ES servers.

@tsullivan can you help with this?

Hi,

  • Which version of Kibana is this?
  • Are you using frozen data tiers or cross-cluster search?

CSV reports are a Kibana feature, so let's try looking for logs on the Kibana server. It would help to enable debug logging; a minimal example follows.
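
A minimal kibana.yml sketch that turns on debug logging for just the Reporting plugin (restart Kibana after changing the file):

logging:
  loggers:
    - name: plugins.reporting
      level: debug

Then re-run the export and watch the Kibana logs for messages from that logger.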

I'm using 8.3.3. This is a cross-cluster search, across 3 ES nodes.

Thanks for the suggestion to enable debug logging in Kibana. I have done so, and there is something I noticed.

The results I got in the Discover app span multiple indices. Out of the 12 results I have, 7 were exported to the CSV (I'm searching over the last 7 days, so the number of results keeps changing from day to day). I noticed that the 5 results that were not exported came from indices xxx-1751, xxx-1767, xxx-1774, xxx-1784, and xxx-1825.

In the Kibana log, under log.logger: "plugins.reporting.runTask.csv_searchsource.execute-job:xxx", the "Results metadata:" message shows:

"_shards": {
   "failed": 940,
   "failures": [
      {
         "index": "xxx-001737",
         "node": "xxx",
         "reason": {
            "reason": "Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting",
            "type": "exception"
         },
         "shard": 0
      },
      ...
   ],
   "skipped": 0,
   "successful": 1500,
   "total": 2440
}

The "Trying to create too many scroll contexts" failure occurred for indices 1737 to 1851, which includes the 5 indices whose results were not exported to CSV. The other 7 exported results were in indices outside this range. Could this be why those 5 results were missing from the CSV?

Hi, thank you for including the error message content. If I understand this message returned from Elasticsearch, it looks like your export query spans 2,440 shards, and the search failed on 940 of them.

In 8.3, CSV export uses the scroll API in Elasticsearch. There are a few key limitations of this API:

  • Elasticsearch creates a scroll context for each shard that spans the data for the query.
  • Elasticsearch limits the number of open scroll contexts to 500 by default (the search.max_open_scroll_context setting); you can check current usage with the nodes stats API, as sketched below.
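
To check how close you are to that limit, one option (a sketch; field names can vary slightly by version) is the nodes stats API, which reports open search and scroll contexts per node:

GET _nodes/stats/indices/search

In the response, look at indices.search.open_contexts and indices.search.scroll_current on each node while an export is running.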

Starting in v8.6, CSV export uses the point-in-time (PIT) API to collect all the data for a report. This API doesn't use the limited scroll contexts (a sketch of the PIT flow follows the list below).

Note that the point-in-time export comes with its own set of limitations:

  • The security privileges of the reporting user must be granted on the indices directly, including backing indices; having permission on only the alias will not work.
  • If any shards are unavailable at the time of the search, the export will result in an empty CSV file.
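
For context, here is roughly what the PIT flow looks like if driven by hand (a sketch; the index pattern is a placeholder). First open a point in time over the indices:

POST /xxx-*/_pit?keep_alive=1m

Then page through the hits, passing the returned id back on each call:

POST /_search
{
  "size": 500,
  "pit": { "id": "<id returned by the _pit call>", "keep_alive": "1m" },
  "sort": [ { "@timestamp": "asc" } ]
}

Subsequent pages repeat the search with a search_after array copied from the sort values of the last hit of the previous page. Because a single PIT is used for the whole export, the 500 scroll-context limit does not come into play.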

Another thing: since you are using cross-cluster search, on any version of the stack you may run into internal timeout issues, because Elasticsearch may take longer than 30 seconds to return the data to the Kibana Reporting plugin. I recommend increasing the xpack.reporting.csv.scroll.duration setting above its 30-second default. I have seen cases in the field where this value had to be set even higher than 30 minutes. New documentation about this is actively being worked on.
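
For example, in kibana.yml (a sketch; the value is illustrative and may need to be much higher for your cluster):

xpack.reporting.csv.scroll.duration: 10m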

If I have about 2K+ shards at any one time and my query covers the entire index, should I raise the scroll context limit via the search.max_open_scroll_context setting to, say, 3000?

Thanks for this suggestion. I will increase this value.
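
For reference, this is the dynamic cluster setting I plan to change (a sketch; 3000 is just the value from my question):

PUT _cluster/settings
{
  "persistent": {
    "search.max_open_scroll_context": 3000
  }
}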

I recommend not covering the entire index in a single export, and sticking to smaller selections that work with the Kibana defaults.

Kibana has limits in place so the product can deliver the most value to the most consumers in a simple, user-friendly way. When the use case demands working outside of those limits, it may be better to use the Elasticsearch APIs directly in a custom script or application.

It might interest you to see if the SQL REST API could work for your use case, as it offers CSV as an output format: Response Data Formats | Elasticsearch Guide [8.11] | Elastic
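
A minimal sketch of that approach (index pattern and field names are placeholders):

POST /_sql?format=csv
{
  "query": "SELECT \"@timestamp\", message FROM \"xxx-*\" ORDER BY \"@timestamp\" DESC LIMIT 100"
}

The response body is plain CSV, so a script can write it straight to a file. For larger result sets, the non-JSON formats return a cursor in the Cursor HTTP header, which you pass back in a follow-up request to fetch the next page.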
