Which part of a native Java script is potentially slow?

I have indexed 500K documents, each with a maximum of 1000 words. The query is also a text of at most 1000 words. I use a modified version of the Cosine Similarity Script provided by imotov. My problem is that the query takes about 5 to 6 seconds, while the same query with Elasticsearch's default scoring takes about 200 ms! My cluster has 2 nodes, with 5 shards and 1 replica for each shard.

The script accesses the text field of each indexed document and calls the df() and tf() functions of IndexFieldTerm in order to compute the dot product. It also reads a (double) value from each indexed document via ScriptDocValues to normalize the dot product. So my question is: which of these calls is likely to be slow enough to cause such a slow query response?
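
To make the per-document work concrete, here is a simplified, self-contained sketch of what the script does for each document. It is not the actual script and does not use the Elasticsearch API: `CosineSketch`, `TermStats`, and `score` are hypothetical names standing in for the term lookup and the script's run method, and the idf formula is only an assumed example.

```java
import java.util.Map;

public class CosineSketch {

    /** Hypothetical stand-in for the per-document term statistics lookup. */
    public interface TermStats {
        int tf(String term);   // term frequency in the current document
        long df(String term);  // number of documents containing the term
    }

    /**
     * Dot product of query weights and document tf-idf weights over the
     * query terms, divided by a norm value stored with the document.
     */
    public static double score(Map<String, Double> queryWeights, TermStats stats,
                               long numDocs, double docNorm) {
        double dot = 0.0;
        for (Map.Entry<String, Double> e : queryWeights.entrySet()) {
            int tf = stats.tf(e.getKey());   // one lookup per query term
            if (tf == 0) {
                continue;                    // term absent from this document
            }
            long df = stats.df(e.getKey());  // second lookup for matching terms
            // Assumed idf variant, for illustration only.
            double idf = Math.log((double) numDocs / (df + 1.0)) + 1.0;
            dot += e.getValue() * tf * idf;
        }
        // The norm plays the role of the double read via ScriptDocValues.
        return docNorm == 0.0 ? 0.0 : dot / docNorm;
    }
}
```

So with a 1000-word query, the loop can perform up to 1000 tf() lookups (plus df() lookups for matching terms) for every document that gets scored, followed by a single read of the stored norm.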

Thanks a lot!