Reading Elasticsearch data using Spark SQL is too slow

Hello, I have a question about using Spark with Elasticsearch.

I wrote the code below:

# Initializing PySpark
from pyspark import SparkContext, SparkConf, SQLContext

# Spark Config
conf = SparkConf().setAppName("es_app")
sc = SparkContext(conf=conf)

# sqlContext
sqlContext = SQLContext(sc)

# ES to dataframe
df = (
    sqlContext.read.format("org.elasticsearch.spark.sql")
    .option("es.nodes", "xxx.xxx.xxx.xxx:9200")
    .option("es.nodes.discovery", "true")
    .load("sample")
)

# Register the DataFrame as a temporary table
df.registerTempTable("sample")

# This query takes too long
sqlContext.sql("SELECT count(*) from sample").show()

The 'sample' index contains 5,000,000 documents.

However, when I run the SQL query, it takes a very long time to return a result (approximately 20 minutes).

Something may be wrong, but I don't know what the cause is.

Do I have to add more options?
