It created a new field, and all the categories are assigned based on the reason.
But the problem is that after some time, some of the values in the category field show '-', even for documents that have a 'reason', and over time more and more records show '-'. If I run the query again, there are no more '-' values, and then the cycle repeats.
What could be the issue, and how can I address it? Thanks for the help!
It sounds to me like you are indexing new data for which the category is not set, or updating documents by overwriting them with documents that do not have the category set. When you run update-by-query, you only update the documents currently in the index; future inserts and updates are not affected. If you want the category to always be populated, you could run your script in an ingest pipeline and have it apply to all changes, which could remove the need to run update-by-query.
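As a rough sketch of what that could look like, the Python below only builds and prints the two request bodies involved: one that creates an ingest pipeline with a `script` processor, and one that sets it as the index's `index.default_pipeline`. The pipeline ID `set-category`, the index name `my-index`, the field names `reason` and `category`, and the categorization rule in the Painless script are all placeholders I made up; you would swap in the logic from your own update-by-query script.

```python
import json

# Hypothetical names: replace with your own pipeline ID and index.
PIPELINE_ID = "set-category"
INDEX_NAME = "my-index"

# Hypothetical Painless rule: stands in for whatever your update-by-query
# script does today. In an ingest script processor the document is `ctx`.
PAINLESS_SOURCE = """
def reason = ctx['reason'];
if (reason == null) {
  ctx['category'] = '-';
} else if (reason.toString().toLowerCase().contains('timeout')) {
  ctx['category'] = 'Network';
} else {
  ctx['category'] = 'Other';
}
""".strip()


def build_requests(pipeline_id, index_name, source):
    """Return (method, path, body) tuples for the two REST calls."""
    pipeline_body = {
        "description": "Populate category from reason at index time",
        "processors": [
            {"script": {"lang": "painless", "source": source}}
        ],
    }
    # Documents indexed without an explicit pipeline will use this one.
    settings_body = {"index.default_pipeline": pipeline_id}
    return [
        ("PUT", f"/_ingest/pipeline/{pipeline_id}", pipeline_body),
        ("PUT", f"/{index_name}/_settings", settings_body),
    ]


if __name__ == "__main__":
    for method, path, body in build_requests(
        PIPELINE_ID, INDEX_NAME, PAINLESS_SOURCE
    ):
        print(method, path)
        print(json.dumps(body, indent=2))
```

You can paste the printed requests into Kibana Dev Tools or send them with whatever client you already use. Documents that are already in the index are not touched by this, so you would still run your update-by-query once to fix the existing '-' values.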