Logstash not loading the exact number of records into Elasticsearch, and results change on every hit

ocmvin · January 20, 2021, 7:07am

Found the solution!

1. The query needs an ORDER BY clause so that records are sorted by emp_no and
Logstash can find and aggregate dependent entities such as titles (a one-to-many relationship).

FROM employees e
LEFT JOIN titles t ON e.emp_no = t.emp_no
ORDER BY e.emp_no


2. Because aggregation is used here, a single worker thread must process the records; otherwise
aggregation breaks, and that is where the random results come from when you search the index repeatedly (as per the URL above). Running with only one worker thread looks like a performance hit, but it can be mitigated by running multiple Logstash config files, each covering a disjoint set of records (e.g. the first 100 emp_no values in one file and the second hundred in another), so that Logstash can process them in parallel.
So run it like this:
logstash -f logstash_config.conf -w 1
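
To illustrate why both the ordering and the single worker matter, here is a minimal Python sketch. It is not Logstash code and not the config from this thread. It only mimics an aggregation that emits the previous employee's document whenever emp_no changes. The function name `aggregate_consecutive` and the sample rows are hypothetical.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Row = Tuple[int, str]  # (emp_no, title): one row per joined title


def aggregate_consecutive(rows: Iterable[Row]) -> Iterator[Dict]:
    """Emit one document each time emp_no changes.

    This only works if all rows for an employee arrive back to back.
    """
    current_key = None
    titles: List[str] = []
    for emp_no, title in rows:
        if current_key is not None and emp_no != current_key:
            # Key changed: flush what was collected for the previous employee.
            yield {"emp_no": current_key, "titles": titles}
            titles = []
        current_key = emp_no
        titles.append(title)
    if current_key is not None:
        # Flush the last employee.
        yield {"emp_no": current_key, "titles": titles}


# Hypothetical sample rows, not taken from the original post.
rows: List[Row] = [
    (10001, "Engineer"),
    (10001, "Senior Engineer"),
    (10002, "Staff"),
    (10002, "Senior Staff"),
]

# Sorted by emp_no: each employee's rows are adjacent.
sorted_docs = list(aggregate_consecutive(sorted(rows, key=lambda r: r[0])))

# Rows for the same employee are no longer adjacent, as can happen without
# ORDER BY or when several workers process events concurrently.
interleaved = [rows[0], rows[2], rows[1], rows[3]]
fragmented_docs = list(aggregate_consecutive(interleaved))

print(len(sorted_docs))      # 2 documents, one per employee
print(len(fragmented_docs))  # 4 documents, each employee split in two
```

With sorted input each employee ends up in one complete document. With interleaved input each employee is split into partial documents. If the output is keyed on emp_no (an assumption about the setup), whichever fragment arrives last would win, which would explain results changing between runs.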
