As part of a migration away from OpenSearch, we want to send data from AWS Kinesis Firehose to Elastic Cloud. Historically, this required workarounds: either a transformation Lambda or a custom HTTP destination acting as a proxy between Firehose and Elastic Cloud. Both options are inconvenient and add complexity.
We were super happy to see the new Elastic destination for Kinesis Firehose.
However, as excited as we were, we were equally disappointed to see that this solution is limited to a fixed set of datasets, i.e. predefined log types. This seems like a strange choice, as plenty of people need to send custom data to Elastic Cloud. We have been asking for this for years, and so have many others.
We really need a convenient way of getting our custom documents from Kinesis Firehose into Elastic Cloud (into a custom index or data stream). We are surprised and a bit puzzled as to why this was not baked into this new solution from the beginning, as it is by no means an edge case.
So our questions are along the lines of:
Why is the destination limited to a specific use case? This seems odd.
Are there any plans to allow custom data in the future?
Are there any good workarounds other than the ones we described above?
This solution came so close to solving a lot of issues with connecting AWS and Elastic Cloud, but this design choice means it falls just short. We think it would make a lot of sense to broaden its scope, both for us and for others.
Hi, thanks for raising this. The docs are quite focused on standard logging use cases, as these will be the most common.
It looks like you can just specify your own data stream name through es_datastream_name.
Have you tried that? If you can send the data to your own data stream, you can also control mappings, ingest pipelines, etc. I believe the docs only give examples of how to route the data if you want to use the integrations.
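I haven't verified this end to end, but since the Elastic destination is configured as a Firehose HTTP endpoint, passing the attribute should look roughly like the sketch below. The stream name, endpoint URL, API key, role, and bucket are all placeholders:

```python
import boto3

firehose = boto3.client("firehose", region_name="eu-west-1")

# Hypothetical delivery stream sending custom documents to Elastic Cloud.
# The Elastic destination is an HTTP endpoint; the target data stream is
# passed as the es_datastream_name common attribute.
firehose.create_delivery_stream(
    DeliveryStreamName="custom-docs-to-elastic",  # placeholder name
    DeliveryStreamType="DirectPut",
    HttpEndpointDestinationConfiguration={
        "EndpointConfiguration": {
            "Url": "https://my-deployment.es.eu-west-1.aws.elastic-cloud.com",  # placeholder
            "Name": "Elastic Cloud",
            "AccessKey": "<elastic-api-key>",  # placeholder credential
        },
        "RequestConfiguration": {
            "ContentEncoding": "GZIP",
            "CommonAttributes": [
                # Route documents to your own data stream instead of a
                # predefined integration dataset.
                {"AttributeName": "es_datastream_name",
                 "AttributeValue": "logs-custom-default"},
            ],
        },
        # Firehose requires an S3 configuration for backing up failed documents.
        "S3BackupMode": "FailedDataOnly",
        "S3Configuration": {
            "RoleARN": "arn:aws:iam::123456789012:role/firehose-role",  # placeholder
            "BucketARN": "arn:aws:s3:::my-firehose-backup",             # placeholder
        },
    },
)
```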
Honestly, I had only read the documentation and understood that the value for the es_datastream_name parameter had to be one of the listed ones. It didn't even cross my mind that it was possible to specify a custom data stream, perhaps because I was misled by the docs being so focused on logging.
I will definitely have to test that, as it would entirely solve my problem. Thank you very much for the suggestion!
Until I do test it, if anyone knows whether this is possible, I would appreciate a heads-up. In the meantime, here is roughly what I plan to try on the Elasticsearch side.
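The idea is an index template matching the custom data stream name, so the documents Firehose sends pick up my own mappings and ingest pipeline. All names are placeholders, and I haven't confirmed any of this against the Firehose destination yet:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(
    "https://my-deployment.es.eu-west-1.aws.elastic-cloud.com",  # placeholder
    api_key="<elastic-api-key>",                                 # placeholder
)

# Index template matching the custom data stream name that the Firehose
# delivery stream targets via es_datastream_name.
es.indices.put_index_template(
    name="custom-docs-template",        # placeholder template name
    index_patterns=["logs-custom-*"],   # must match the data stream name
    data_stream={},                     # matching names become data streams
    priority=200,                       # outrank the built-in logs-*-* template
    template={
        "settings": {
            # Run my own ingest pipeline on every incoming document.
            "index.default_pipeline": "custom-docs-pipeline",  # placeholder
        },
        "mappings": {
            "properties": {
                "@timestamp": {"type": "date"},
                "message": {"type": "text"},
            }
        },
    },
)
```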