Hello,
I have a data stream that I am posting grocery receipts to on an hourly basis. Each receipt includes the individual items purchased, as well as a keyword array of the unique product UPC codes. I’m trying to match these receipts against online shopping receipts (e.g. InstaCart). The online receipts don’t always list the same items as the sales receipts: there are occasional substitutions, and the store isn’t always correct. I know I should get multiple results back if there are discrepancies. If I run a bool query with must clauses, I only get hits when there is a 100% match. If I run a bool query with should clauses, I get a lot of false matches. Does anyone have a suggestion for a better way to match these up?
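One middle ground between must and should is the bool query's `minimum_should_match` parameter, which requires only a share of the should clauses to match. The following is a minimal sketch of building such a request body, not a tested solution for this data: the field name `upc_codes`, the helper name `build_receipt_match_query`, and the 75% threshold are all assumptions chosen for illustration.

```python
def build_receipt_match_query(upc_codes, min_match="75%", field="upc_codes"):
    """Build a bool query body where only part of the UPC list has to match.

    `field` and `min_match` are hypothetical defaults; tune both to the
    real mapping and to how many substitutions are typical.
    """
    # De-duplicate so repeated items do not add extra should clauses.
    unique_codes = sorted(set(upc_codes))
    return {
        "query": {
            "bool": {
                # One term clause per UPC on the online receipt.
                "should": [{"term": {field: code}} for code in unique_codes],
                # Require a share of the clauses to match instead of all
                # of them (must) or any single one (plain should).
                "minimum_should_match": min_match,
            }
        }
    }


body = build_receipt_match_query(
    ["036000291452", "012345678905", "036000291452"]
)
```

The resulting `body` can be sent as the search request. Raising the threshold cuts false matches, while lowering it tolerates more substitutions.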
Thank you for the help. Your search improved the results, but I also found that the source data I was using for the matching was faulty: some UPC codes had check digits and some didn’t.
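For anyone hitting the same check digit problem, here is a rough sketch of normalizing the codes before indexing and querying. It assumes the codes are UPC-A, that a code without its check digit is exactly 11 digits, and that a code with one is 12 digits. Those assumptions may not hold if leading zeros were stripped somewhere upstream. The function names are made up for this example.

```python
def upc_a_check_digit(first_eleven):
    """Compute the UPC-A check digit for the first 11 digits of a code."""
    digits = [int(ch) for ch in first_eleven]
    odd_sum = sum(digits[0::2])   # 1st, 3rd, ..., 11th digits
    even_sum = sum(digits[1::2])  # 2nd, 4th, ..., 10th digits
    return (10 - (odd_sum * 3 + even_sum) % 10) % 10


def normalize_upc(code):
    """Return the 12-digit form of a UPC-A code, or None if it looks invalid."""
    code = code.strip()
    if not (code.isascii() and code.isdigit()):
        return None
    if len(code) == 11:
        # Assumed to be missing its check digit, so append it.
        return code + str(upc_a_check_digit(code))
    if len(code) == 12 and upc_a_check_digit(code[:11]) == int(code[11]):
        return code
    # Wrong length or a check digit that does not verify.
    return None
```

Running both the store receipts and the online receipts through the same normalization means the keyword terms compare like for like.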