I have been using Painless scripts saved on my ES cluster for a while now on a project, and little is documented about how they work "under the covers" of Elasticsearch; at least I can't find much. Can any of the experts tell me:
How is a Painless script executed on an ES cluster once a query is received? That is, how does its execution differ from that of a regular query?
Does sharding have a similar impact on performance for Painless scripts as it does for ES queries?
Are Painless scripts executed in a multithreaded fashion?
What may be some best practices when using Painless?
Are there some field types stored on an ES cluster that work better with Painless (e.g. keyword vs. integer)?
There are many different places where we use scripts. For a script query, we execute the given script for every document that matches the query. Not sure if that answers your question.
Sharding is our way of achieving parallelism. One search request corresponds to one thread per shard (simplified). The more shards, the more parallelism. Yet the script performance for a single document doesn't change.
Per shard, scripts run sequentially for a single request; see above.
not sure how to answer this.
I think numbers are in general preferable over strings.
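To make the sharding point above concrete, here is a toy model in Python, not Elasticsearch code. It assumes one worker thread per shard for a single request and a hypothetical routing rule (document id modulo shard count); the `script` function is only a stand-in for a Painless script that reads a numeric field. It shows that adding shards adds parallelism while the total number of script executions stays the same.

```python
from concurrent.futures import ThreadPoolExecutor

def script(doc):
    # Stand-in for a Painless script: a simple check on a numeric field.
    return doc["price"] > 50

def search_shard(shard_docs):
    # Within one shard, a single request visits its documents one by one.
    calls = 0
    hits = []
    for doc in shard_docs:
        calls += 1
        if script(doc):
            hits.append(doc["id"])
    return hits, calls

def search(docs, num_shards):
    # Hypothetical routing: document id modulo the number of shards.
    shards = [[] for _ in range(num_shards)]
    for doc in docs:
        shards[doc["id"] % num_shards].append(doc)
    # Assumption: one thread per shard for this request.
    with ThreadPoolExecutor(max_workers=num_shards) as pool:
        results = list(pool.map(search_shard, shards))
    hits = sorted(h for shard_hits, _ in results for h in shard_hits)
    total_calls = sum(calls for _, calls in results)
    return hits, total_calls

docs = [{"id": i, "price": i % 100} for i in range(1000)]
hits_1, calls_1 = search(docs, 1)
hits_5, calls_5 = search(docs, 5)
```

In this sketch the script runs 1000 times with one shard and 1000 times with five shards; only the amount of work each thread does changes.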
Thanks for your responses, Simon. This does help some. Just to clarify my understanding, you mentioned that a Painless script will execute once for each document matched by the query. So if, for example, I have an ES index containing 20 million documents and a query that returns 2 million of them, the script will execute 2 million times, once for each matching document?
Well, if you have a script_query, for instance, that you use to match documents, we have to execute that script for every document that can potentially match the query. If you have 20 million matches, it will have been executed at least 20 million times. But this is a lower bound; how often we have to check the result of the script to make a decision depends on the rest of your query.
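A rough way to picture the "depends on the rest of your query" part is the Python sketch below. It is a simplified model and not how Elasticsearch is actually implemented: it assumes a cheaper clause (the hypothetical `cheap_filter`) is checked first, so the script only runs on documents that survive it. The field names `status` and `price` are made up for illustration, and the numbers are scaled down from the example in the question.

```python
def run_query(docs, cheap_filter, script):
    # Count how often the script stand-in is invoked.
    script_calls = 0
    hits = []
    for doc in docs:
        if not cheap_filter(doc):
            continue  # rejected by the cheap clause, script never runs
        script_calls += 1
        if script(doc):
            hits.append(doc["id"])
    return hits, script_calls

# 20,000 documents, of which 2,000 are "active".
docs = [
    {"id": i, "status": "active" if i % 10 == 0 else "archived", "price": i % 100}
    for i in range(20000)
]

def script(doc):
    # Stand-in for a Painless script on a numeric field.
    return doc["price"] >= 50

# Script as the only clause: every document is a candidate.
all_hits, all_calls = run_query(docs, lambda d: True, script)

# Script combined with a selective clause: far fewer candidates.
few_hits, few_calls = run_query(docs, lambda d: d["status"] == "active", script)
```

Here the script runs 20,000 times when it is the only clause and 2,000 times when the selective clause narrows the candidates first. In both cases it runs at least once per final hit, which is the lower bound mentioned above.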