Match financial transactions and merchants

I have an Elastic index with billions of transactions.

An example tx:

   {
     "id": "54bfa9af-009a-437d-bd21-caaf651f7218",
     "amount": 100.0,
     "currency": "EUR",
     "type": "expense",
     "note": "CARD PAYMENT TO AZAMON.COM 100.0 EUR, RATE 0.86/GBP ON 05-05-2022"
   }

I have a few million merchants (companies, etc.) in an RDBMS table:

id    | Name
123   | Azamon Ltd
456   | Alple Inc.
789   | Goooogle

I can easily ingest them into another Elastic index.

Both the transaction's note and the merchant's name are analyzed fields.

For every new transaction indexed, I would like to enrich its content with a merchant ID + name. It doesn't have to be perfect, though. A score threshold could be fine-tuned once the solution works for most of the matching transactions.


For the tx above, I would like to obtain "123, Azamon Ltd" as the search result.
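
To make the goal concrete, here is a rough Python sketch of the enrichment step I have in mind. It assumes the merchant lookup returns a standard Elasticsearch search response (hits sorted by `_score`, best first) and that each merchant document stores its name in a `name` field. The function name, the `merchant_id` / `merchant_name` output fields, and the sample response are all made up for illustration.

```python
def enrich(tx, response, threshold):
    """Return a copy of tx with merchant fields added if the top hit scores high enough."""
    hits = response.get("hits", {}).get("hits", [])
    enriched = dict(tx)  # leave the original transaction untouched
    if hits and hits[0]["_score"] >= threshold:
        best = hits[0]
        # Hypothetical output field names; "name" is the assumed merchant field.
        enriched["merchant_id"] = best["_id"]
        enriched["merchant_name"] = best["_source"]["name"]
    return enriched


# Illustrative (invented) search response for the example tx.
sample_response = {
    "hits": {"hits": [{"_id": "123", "_score": 7.5, "_source": {"name": "Azamon Ltd"}}]}
}
tx = {"id": "54bfa9af-009a-437d-bd21-caaf651f7218", "note": "CARD PAYMENT TO AZAMON.COM"}
matched = enrich(tx, sample_response, threshold=5.0)
unmatched = enrich(tx, sample_response, threshold=10.0)
```

With a threshold of 5.0 the tx would get the merchant attached; with 10.0 it would be left as-is, which is the knob I expect to tune later.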

Should I just create a custom tokenizer and analyzer for that, and run a query against a "merchants" index using the tx note as the search term? What would be a good pipeline structure for single-language tx/merchant matching?

Or is there a more efficient out-of-the-box solution for this problem? I'm reading about NER, document similarity and other stuff, but I can't figure out what the best approach is in my (simple) case.
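
For reference, this is roughly what I imagine the analyzer-plus-query approach would look like, written as Python dicts that could be sent as the index-creation and search request bodies. It is only a sketch under my own assumptions: the index name `merchants`, the field name `name`, the analyzer name `merchant_text`, the filter name `tx_noise`, and the list of noise words are all placeholders. It relies on the built-in `letter` tokenizer (splits on non-letters, so "AZAMON.COM" becomes "azamon" and "com" after lowercasing), a `stop` token filter, and a `match` query with `fuzziness` and a top-level `min_score`.

```python
# Hypothetical names; adjust to the real index and mapping.
MERCHANT_INDEX = "merchants"
NAME_FIELD = "name"


def merchant_index_body(noise_words):
    """Index settings/mappings with a custom analyzer that drops boilerplate tokens."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Stop filter removing tokens that carry no merchant signal.
                    "tx_noise": {"type": "stop", "stopwords": list(noise_words)}
                },
                "analyzer": {
                    "merchant_text": {
                        "type": "custom",
                        "tokenizer": "letter",  # splits on any non-letter character
                        "filter": ["lowercase", "tx_noise"],
                    }
                },
            }
        },
        "mappings": {
            "properties": {NAME_FIELD: {"type": "text", "analyzer": "merchant_text"}}
        },
    }


def merchant_query(note, min_score):
    """Search body: use the raw tx note as the query text against merchant names."""
    return {
        "size": 1,               # only the best candidate is needed
        "min_score": min_score,  # drop weak matches server-side
        "query": {"match": {NAME_FIELD: {"query": note, "fuzziness": "AUTO"}}},
    }


index_body = merchant_index_body(["card", "payment", "to", "com", "ltd", "inc", "rate", "on"])
search_body = merchant_query(
    "CARD PAYMENT TO AZAMON.COM 100.0 EUR, RATE 0.86/GBP ON 05-05-2022", min_score=5.0
)
```

Since the `match` query analyzes the note with the field's analyzer by default, the same noise-word removal would apply on both sides. Whether this scales to billions of transactions is exactly what I'm unsure about.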

Pointing me to relevant and proven doc pages will be considered an acceptable answer. TY
