My scenario is that I have 5 fields in an existing index "emp_data" in Elasticsearch. One of these fields is employee_name. I have to add two new fields, "department" and "company", describing where each employee belongs. The "emp_data" index is big, in the range of 20 million documents, and it keeps updating on an hourly basis.
I have a small file, a few KB in size, containing "employee_name", "company", and "department" information in CSV format.
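For illustration, the file looks something like this (the column names come from above; the rows are made-up examples, not real data):

```csv
employee_name,company,department
Jane Doe,Acme Corp,Finance
John Smith,Acme Corp,Engineering
```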
I can load this file into a separate Elasticsearch index, say "corp_data".
My question is: what is the best way in Elasticsearch to:
- Join emp_data (the big dataset) with corp_data (the very small dataset)
My restrictions are: I can't do the join outside ES, because emp_data is loaded by an application I don't have access to. I can only do the join after emp_data is already populated.
Previously, in a SQL database, I used an ETL job to join these two datasets and populate all 7 fields into a new table "emp_record", separate from emp_data and corp_data.
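To make the intent concrete, here is a minimal Python sketch of what that ETL effectively did: look up each employee in the small dataset by employee_name and copy over company/department. All records and field values here are hypothetical examples, not real data.

```python
# Sketch of the old ETL join: enrich each emp_data record with
# company/department looked up from the small corp_data set by employee_name.
# All records below are made-up examples.

corp_data = [
    {"employee_name": "Jane Doe", "company": "Acme Corp", "department": "Finance"},
]

emp_data = [
    {"employee_name": "Jane Doe", "title": "Analyst"},       # 5 fields in reality
    {"employee_name": "Unknown Person", "title": "Contractor"},
]

# The small dataset (a few KB) fits easily in memory as a lookup table.
lookup = {row["employee_name"]: row for row in corp_data}

def enrich(record):
    """Return a copy of the record with company/department added when known."""
    match = lookup.get(record["employee_name"], {})
    return {
        **record,
        "company": match.get("company"),
        "department": match.get("department"),
    }

# The enriched records are what used to land in the "emp_record" table.
emp_record = [enrich(r) for r in emp_data]
```

The question is how to achieve this same enrichment inside Elasticsearch, given that emp_data is populated by another application.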
I am new to ES.