I am new to ES, so could you please help me with the questions below?
As per my requirement, I need to pull data from MySQL into Elasticsearch every minute. From ES I will generate reports using a BI reporting tool. My DB is very dynamic, as it is the backend for an online sales application.
1. What should I use to pull the data from the DB into ES? (I read somewhere that rivers are deprecated, so please suggest the right component/tool.)
I might need to join many tables in my DB to build each ES schema. Will that create any problems?
Also, is it possible to join across schemas in ES?
Also, how can I move the incremental data to ES every minute?
Some of my questions may be very basic, but please help me move forward.
1.1 I think it's tricky to solve this with an ETL tool, so I would probably write my own code that reads the DB, does the joins, and creates a full JSON object at the end. Have a look at the "Our CRM database" section of http://david.pilato.fr/blog/2015/05/02/devoxx-france-2015/
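As a rough sketch of the "write your own code" idea in 1.1, the Python below joins a few tables and folds the rows into one nested JSON document per order. Everything specific here is an assumption for illustration: an in-memory SQLite database stands in for MySQL, and the tables (`customers`, `orders`, `order_items`), their columns, and the `orders` index name are hypothetical. The output is newline-delimited JSON in the shape the Elasticsearch `_bulk` endpoint accepts; older Elasticsearch versions may also expect a `_type` in the action line.

```python
import json
import sqlite3

# In-memory SQLite stands in for MySQL; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    CREATE TABLE order_items (order_id INTEGER, sku TEXT, qty INTEGER);
    INSERT INTO customers VALUES (1, 'Alice');
    INSERT INTO orders VALUES (10, 1, 59.5);
    INSERT INTO order_items VALUES (10, 'SKU-1', 2), (10, 'SKU-2', 1);
""")

def build_order_documents(conn):
    """Join the relational tables and return one nested dict per order."""
    rows = conn.execute("""
        SELECT o.id, o.total, c.id, c.name, i.sku, i.qty
        FROM orders o
        JOIN customers c ON c.id = o.customer_id
        LEFT JOIN order_items i ON i.order_id = o.id
        ORDER BY o.id, i.sku
    """)
    docs = {}
    for order_id, total, cust_id, cust_name, sku, qty in rows:
        # The first row seen for an order creates the document skeleton.
        doc = docs.setdefault(order_id, {
            "order_id": order_id,
            "total": total,
            "customer": {"id": cust_id, "name": cust_name},
            "items": [],
        })
        # Orders without items come back with NULL item columns (LEFT JOIN).
        if sku is not None:
            doc["items"].append({"sku": sku, "qty": qty})
    return list(docs.values())

def to_bulk_body(docs, index_name="orders"):
    """Serialize documents as newline-delimited JSON: action line, then source line."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name, "_id": str(doc["order_id"])}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"

docs = build_order_documents(conn)
bulk_body = to_bulk_body(docs)
print(bulk_body)
```

Because the joins are resolved before indexing, each document already carries its customer and items, so the BI tool would not need a join at query time.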
1.2 You might want to use the parent/child feature, which could help you. But don't use it just because you are coming from a relational system to a non-relational one; use it only if your use case requires it. In other words, read this: https://www.elastic.co/guide/en/elasticsearch/guide/current/parent-child.html
1.3 Run your batch as a cron job, or use the options provided by the 2 tools I mentioned earlier.
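One possible way to make the per-minute batch incremental is a high-water mark: remember the latest modification timestamp already processed and select only rows changed after it on the next run. The sketch below assumes the source table has an `updated_at` column that is set on every insert and update; that column, the `orders` table, and the `fetch_changes` helper are hypothetical, and SQLite again stands in for MySQL.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the orders table and updated_at column are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL, updated_at TEXT);
    INSERT INTO orders VALUES (1, 10.0, '2015-05-02 10:00:00');
    INSERT INTO orders VALUES (2, 25.0, '2015-05-02 10:00:30');
""")

def fetch_changes(conn, last_seen):
    """Return rows modified after last_seen, plus the new high-water mark."""
    rows = conn.execute(
        "SELECT id, total, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    # Keep the old mark when nothing changed, so the next run repeats the same window.
    new_mark = rows[-1][2] if rows else last_seen
    return rows, new_mark

# First run: everything is newer than the initial mark.
mark = "1970-01-01 00:00:00"
first_batch, mark = fetch_changes(conn, mark)

# A row changes between two runs; only that row is picked up next time.
conn.execute(
    "UPDATE orders SET total = 30.0, updated_at = '2015-05-02 10:01:10' WHERE id = 2"
)
second_batch, mark = fetch_changes(conn, mark)
print(len(first_batch), len(second_batch), mark)
```

In a real job the mark would have to be persisted between runs (a file or a small table), and a crontab schedule of `* * * * *` would trigger the script every minute. Two limits of this sketch: rows sharing exactly the same timestamp as the mark can be missed with a strict `>` comparison, and deleted rows are not detected at all.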
Thank you very much for your detailed reply. I will go through all your links and let you know if I get stuck while trying the solution you suggested.