We use Kibana as a UI for searching and analyzing some data that we store in Elasticsearch. The individual data entries in Elasticsearch have S3 identifiers stored with them for relevant files. We need to enable our users to download these files. We already have a service that generates pre-signed S3 links for downloading the files but we somehow need to integrate this into the Discover page and/or Dashboards.
The question is not strictly tied to S3: how would you achieve this with any storage method? I’ve tried writing a custom plugin but am having a hard time getting it to work. Is there an easier way, with some custom actions/hooks that Kibana supports out of the box, that I am missing?
Yes, there is a field that contains a list of S3 keys relevant to the document. We also have a service that can generate download URLs for these identifiers.
Effectively you are using some sort of key to identify the full URL from another index.
The other index can be populated and kept up to date in various ways.
Without knowing your specific data flow, and based on my limited understanding, it appears not that difficult: you have a facility to generate the full URLs per ID, and you don’t really care about any sort of access control (double and triple check this).
More specific advice would need more specific details on the data flow and the URL-generating service. Could that service not be ported into an ingest pipeline?
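For illustration, a minimal sketch of that ingest-pipeline idea using the Elasticsearch JS client. The field names `s3_key` and `download_url` and the URL template are assumptions, and this only works if the link can be computed statically (no pre-signing):

```ts
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

// One-off setup: a pipeline that derives a download URL from the stored key
// at ingest time, using a `set` processor with a Mustache template.
async function createPipeline() {
  await client.ingest.putPipeline({
    id: 'add-download-url',
    description: 'Derive a download URL from the S3 key at ingest time',
    processors: [
      {
        set: {
          field: 'download_url',
          value: 'https://files.internal.example/{{{s3_key}}}',
        },
      },
    ],
  });
}

createPipeline().catch(console.error);
```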
I was thinking about this solution as well; however, there are 2 downsides/blockers in our specific use case.
The URLs generated for download have an expiry date (the max is 2 weeks). This is something we would rather keep, and it is also enforced by Amazon. That is the reason we would ideally generate these on request, valid for, let’s say, 15 minutes (see the pre-signing sketch after this list of downsides).
Including links that point to the generating service is an interesting thought. However, this would mean exposing a service that could otherwise stay internal (Kibana and this service run on the same closed internal network, and only Kibana is exposed). That would open us up to more attack vectors, and we would also need to set up some kind of authZ etc., which would be a fair amount of work. Currently the single point of access to the whole system from the outside network is Kibana, and it would be great if we could keep it that way.
Another potential downside is that if we were to generate the S3 download URLs for all entries up front, it would:
- cost us money
- not be an efficient use of our resources, as the download links will be needed for 1 out of many, many entries. That’s why it would be great if we could generate them only when the user needs to download the raw files.
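For reference, on-request generation is cheap with the AWS SDK. A minimal sketch, assuming AWS SDK for JavaScript v3; the bucket name and region are placeholders:

```ts
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3';
import { getSignedUrl } from '@aws-sdk/s3-request-presigner';

const s3 = new S3Client({ region: 'eu-west-1' }); // region is a placeholder

// Pre-sign a GET for a single object, valid for 15 minutes, only when a
// user actually requests a download. No URL exists until then.
export async function presignDownload(key: string): Promise<string> {
  const command = new GetObjectCommand({ Bucket: 'my-bucket', Key: key });
  return getSignedUrl(s3, command, { expiresIn: 900 }); // 900 s = 15 min
}
```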
I will take a look at this documentation in more depth in the morning. However, I think it will have the same problems I mentioned in my other comment.
We really need to be able to generate the download links on request. Downloading raw files is crucial functionality, but it will be used for only a small fraction of Elasticsearch documents; it won’t be needed in 90% of the system’s use cases.
The original approach I wanted to implement was a custom plugin that would extend the actions available when a row is selected in the Discover page or a Dashboard (or on hover, or a similar type of activation). This action would call a custom Kibana server route that would invoke the service (running on the internal network and visible to Kibana) to generate the download URLs, and then either return the download links to display somewhere or automatically redirect the user.
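To make the idea concrete, here is a rough sketch of the server-side half of such a plugin. Kibana’s plugin APIs change between versions (the imports below match recent 8.x), and the route path, plugin class name, and internal service URL are all made up:

```ts
import { schema } from '@kbn/config-schema';
import type { CoreSetup, Plugin } from '@kbn/core/server';

export class DownloadLinksPlugin implements Plugin {
  public setup(core: CoreSetup) {
    const router = core.http.createRouter();

    // Route the browser-side action can call. Kibana proxies the request to
    // the internal URL-generating service, which stays unexposed.
    router.get(
      {
        path: '/api/download_links/{key}',
        validate: { params: schema.object({ key: schema.string() }) },
      },
      async (context, request, response) => {
        // `url-service.internal` is a placeholder for the internal service.
        const res = await fetch(
          `http://url-service.internal/presign/${encodeURIComponent(request.params.key)}`
        );
        const { url } = await res.json();
        return response.ok({ body: { url } });
      }
    );
  }

  public start() {}
}
```

The browser-side action would then fetch this route and, for example, open the returned URL in a new tab, so the pre-signed link is only ever created at click time.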
I tried to implement the above solution, but I don’t have much experience with JS and frontend development. So far I haven’t been able to get it working, hence my inquiry here for advice: I don’t know if this is a dead end or whether there is a better way to implement what I am trying to achieve.