Hi All
We are using MySQL as our primary database and Elasticsearch as a secondary data store, basically for search. We are building a job portal. There are three main use cases where search will be performed on Elasticsearch:
Job Search
CV bank Search
Applicants Search/Match for a job post
For these use cases, we need to store the job post data (for Job Search), each applicant's current CV (for CV bank Search), and the applicant's CV as it was when they applied to a job post (for Applicants Search/Match for a job post). We have identified three possible ways to keep the data in sync between MySQL and Elasticsearch:
Use a queue server (RabbitMQ or Kafka): When something is stored in MySQL, we send a message to the queue server, and a consumer retrieves the data, transforms it, and stores it in Elasticsearch.
Use Logstash: Periodically, Logstash checks the MySQL database/tables for updates and stores the changed data in Elasticsearch.
Real-time sync from the application layer: We store the data in Elasticsearch right after storing it in MySQL, at the application/code level.
I know all of these have pros and cons in terms of performance, real-time sync, maintaining another server, etc., but I am not sure which way to go. Can you please guide me? Thanks in advance.
Basically, I'd recommend modifying the application layer if possible and sending the data to Elasticsearch in the same "transaction" in which you send it to the database.
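To make that a bit more concrete, here is a rough sketch of what I mean, not a definitive implementation. Everything in it is an assumption for illustration: sqlite3 stands in for MySQL so the snippet runs on its own, InMemorySearchIndex is a hypothetical stand-in for an Elasticsearch client (its index(index=..., id=..., document=...) method only mimics the general shape of a client call), and the job_posts table and save_job_post function are made-up names.

```python
import sqlite3


class InMemorySearchIndex:
    """Hypothetical stand-in for an Elasticsearch client; keeps documents in a dict."""

    def __init__(self, fail=False):
        self.docs = {}
        self.fail = fail  # set to True to simulate the search cluster being down

    def index(self, index, id, document):
        if self.fail:
            raise RuntimeError("search index unavailable")
        self.docs[(index, id)] = document


def save_job_post(conn, search, job_id, title, description):
    """Write a job post to the database and the search index as one unit of work."""
    try:
        conn.execute(
            "INSERT INTO job_posts (id, title, description) VALUES (?, ?, ?)",
            (job_id, title, description),
        )
        # Index before committing, so that an indexing failure rolls the row back.
        search.index(
            index="job_posts",
            id=job_id,
            document={"title": title, "description": description},
        )
        conn.commit()
    except Exception:
        conn.rollback()
        raise


if __name__ == "__main__":
    # sqlite3 is used here only as a stand-in for MySQL (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE job_posts (id INTEGER PRIMARY KEY, title TEXT, description TEXT)"
    )
    conn.commit()

    search = InMemorySearchIndex()
    save_job_post(conn, search, 1, "Backend Engineer", "Python and MySQL")
    print(conn.execute("SELECT COUNT(*) FROM job_posts").fetchone()[0])
    print(search.docs[("job_posts", 1)]["title"])
```

Note that this is not a real distributed transaction, which is why I put "transaction" in quotes: if the database commit itself fails after the document has been indexed, the index can still end up with a document that has no matching row, so you would likely want some retry or cleanup logic around it.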