We want to make our data (~10 TB) on HDFS interactively queryable. Due to our data policy, we can't install ES on the same cluster as Hadoop, so the es-hadoop connector is not an option for us. I'm currently trying to generate the index data on the Hadoop cluster, because our Hadoop cluster is much larger (~3K nodes) than our ES cluster (up to 20 nodes). However, I have no idea yet how large the index data will be, or how to make it available to the ES cluster (pull it via an HDFS proxy?).
Do you have any ideas?