We are moving data between clouds using Spark and ES.
Is there any way to read using gzip compression? Any advice on how we can move a large amount of data daily from an active index to a Spark cluster hosted in another cloud?
This is a frequent request from the community, and one that I am certainly on board with. Sadly, at the moment we do not support it. We currently use the built-in Apache HTTP client implementation that ships with the Hadoop ecosystem libraries. In Hadoop 2.8+ that library is scrapped, so we're looking to align with the Elasticsearch project and adopt their low-level REST client (potentially shading it and packaging it with the connector). That library is modern enough to support compression options like gzip.
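In the meantime, the cross-cloud move itself is doable with the connector today; you just won't get gzip on the wire. A minimal sketch of a daily export job in PySpark is below. Note this is mostly connector configuration and is untested here: the host, index name, bucket path, and `@timestamp` field are all placeholders for your own setup, and you'd submit with the `elasticsearch-spark` package matching your ES and Spark versions.

```python
# Sketch: daily export from an active ES index into Spark, then out as Parquet.
# Host, index, bucket, and timestamp field below are assumptions — adjust them.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("es-to-spark-daily-export")
    # the elasticsearch-spark connector jar must be on the classpath, e.g.
    # spark-submit --packages org.elasticsearch:elasticsearch-spark-20_2.11:<version>
    .getOrCreate()
)

df = (
    spark.read.format("org.elasticsearch.spark.sql")
    .option("es.nodes", "es.example.com")      # assumed ES host
    .option("es.port", "9200")
    .option("es.nodes.wan.only", "true")       # typically needed across clouds/NAT
    # restrict each daily run to yesterday's documents via a range query
    # on an assumed @timestamp field, so you only pull the new data
    .option("es.query",
            '{"query":{"range":{"@timestamp":{"gte":"now-1d/d","lt":"now/d"}}}}')
    .load("my-active-index")                   # assumed index name
)

# write a compressed columnar copy for the cross-cloud transfer;
# Parquet's built-in compression offsets the lack of gzip on the ES side
df.write.mode("overwrite").parquet("s3a://my-bucket/es-export/")
```

Since scroll reads against an active index aren't a point-in-time snapshot, scoping each run to a closed time window (as with the range query above) also keeps the daily batches consistent and repeatable.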
This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.