We are trying to build a recommendation system with Prediction.io, based on the Universal Recommender template, which uses Elasticsearch as storage.
We have a problem with training performance. The training job on Spark took about 22 hours for only 20M rows.
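To put that number in perspective, here is a rough back-of-the-envelope calculation. It assumes exactly 20M rows and exactly 22 hours, both of which are approximate, and the function name is only illustrative. It works out to roughly 250 rows per second on average.

```python
def rows_per_second(rows: int, hours: float) -> float:
    """Average throughput in rows per second over the whole job."""
    return rows / (hours * 3600)


# Approximate figures from our training run (assumed exact for this estimate).
throughput = rows_per_second(20_000_000, 22)
print(f"{throughput:.0f} rows/s")
```

This is an average over the whole training job, not a measurement of the Elasticsearch write step alone.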
We tried to solve the problem on the Prediction.io forum ( https://groups.google.com/forum/#!topic/actionml-user/ITni3j_6HiY ), but we were told that the problem is probably in the Elasticsearch library ("The code running is in an Elasticsearch library that writes an RDD to ES.").
We started with 1 ES node (24 CPU cores, 64 GB RAM). After we found out that it is probably an ES problem, we added 2 more ES nodes. But no matter how many ES nodes we have, training still takes the same time.
Can someone help us debug and fix the problem?