Found the solution!
1. Use an order by clause in the query so that records are sorted by emp_no.
Logstash can then find and aggregate the dependent entities, such as titles (a one-to-many relationship).
from employees e
LEFT JOIN titles t ON e.emp_no = t.emp_no
order by e.emp_no
2. Because aggregation is used here, a single worker thread has to process the records; otherwise the aggregation breaks, and that is where the random results come from when you search the index repeatedly as per the URL above.
Running with only 1 worker thread looks like a performance hit, but it can be mitigated by running multiple Logstash config files, each covering a disjoint set of records (e.g. the first 100 emp_no values in one file and the next 100 in another), so that Logstash can process them in parallel.
So run it like this:
logstash -f logstash_config.conf -w 1
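
To see why both points matter, here is a small Python sketch of a streaming aggregation keyed on emp_no. It is only a simplified model under my own assumptions, not Logstash's actual implementation: the function name aggregate_stream, the (emp_no, title) row shape, and the sample rows are all made up for illustration. The model closes the current document as soon as a row with a different emp_no arrives, so rows that arrive out of order (unsorted query, or several workers interleaving events) end up split across documents.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple

# Hypothetical shape of one joined row: (emp_no, title).
# title may be None because of the LEFT JOIN.
Row = Tuple[int, Optional[str]]


def aggregate_stream(rows: Iterable[Row]) -> Iterator[Dict]:
    """Fold consecutive rows with the same emp_no into one document.

    The current document is emitted as soon as a row with a different
    emp_no shows up, so correctness depends on the input order.
    """
    current = None
    for emp_no, title in rows:
        if current is not None and current["emp_no"] != emp_no:
            yield current
            current = None
        if current is None:
            current = {"emp_no": emp_no, "titles": []}
        if title is not None:
            current["titles"].append(title)
    if current is not None:
        yield current


# Made-up sample data: the same three rows in two different orders.
sorted_rows = [(1, "Engineer"), (1, "Senior Engineer"), (2, "Staff")]
interleaved_rows = [(1, "Engineer"), (2, "Staff"), (1, "Senior Engineer")]

sorted_docs = list(aggregate_stream(sorted_rows))
interleaved_docs = list(aggregate_stream(interleaved_rows))

print(len(sorted_docs), "documents from sorted input")
print(len(interleaved_docs), "documents from interleaved input")
```

With the sorted input each employee ends up in exactly one document with all titles. With the interleaved input, employee 1 is emitted twice with partial title lists, which is the kind of inconsistent result the order by clause and the -w 1 flag are meant to prevent.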