Match financial transactions and merchants

fabiobozzo · May 12, 2022, 2:35pm

I have an Elastic index with billions of transactions.

An example tx:

{
   "id": "54bfa9af-009a-437d-bd21-caaf651f7218",
   "amount": 100.0, 
   "currency": "EUR",
   "type": "expense",
   ...
   "note": "CARD PAYMENT TO AZAMON.COM 100.0 EUR, RATE 0.86/GBP ON 05-05-2022"
}

I have a few million merchants (companies, etc.) in an RDBMS table:

id    | Name
123   | Azamon Ltd
456   | Alple Inc.
789   | Goooogle
...

I can easily ingest them into another Elastic index.

Both the transaction's note and the merchant's name are analyzed fields.

For every new transaction indexed, I would like to enrich its content with a merchant ID and name. The matching doesn't have to be perfect: a score threshold could be fine-tuned once the solution works for most of the transactions that should match.

E.g., for the tx above, I would like to obtain "123, Azamon Ltd" as the search result.
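To make the threshold idea concrete, here is a minimal sketch of the enrichment step. It assumes the search response has the usual Elasticsearch shape (`hits.hits` sorted by `_score`, each hit carrying `_id` and `_source`), that the merchant document stores its name in a `name` field, and that the result is attached under a hypothetical `merchant` key. The threshold value and the score in the sample response are made up for illustration.

```python
from typing import Any, Dict


def enrich_transaction(tx: Dict[str, Any], response: Dict[str, Any],
                       threshold: float) -> Dict[str, Any]:
    """Return a copy of tx, with merchant info added if the best hit scores high enough."""
    hits = response.get("hits", {}).get("hits", [])
    enriched = dict(tx)  # shallow copy, so the original tx is left untouched
    if hits and hits[0]["_score"] >= threshold:
        top = hits[0]
        # "merchant" is a hypothetical field name, not part of the original tx schema.
        enriched["merchant"] = {"id": top["_id"], "name": top["_source"]["name"]}
    return enriched


# Hand-written response standing in for a real search result (the score is invented).
sample_response = {
    "hits": {"hits": [{"_id": "123", "_score": 7.2, "_source": {"name": "Azamon Ltd"}}]}
}
sample_tx = {
    "id": "54bfa9af-009a-437d-bd21-caaf651f7218",
    "note": "CARD PAYMENT TO AZAMON.COM 100.0 EUR, RATE 0.86/GBP ON 05-05-2022",
}

matched = enrich_transaction(sample_tx, sample_response, threshold=5.0)
unmatched = enrich_transaction(sample_tx, sample_response, threshold=10.0)
```

With the lower threshold the tx gets the merchant attached; with the higher one it is returned unchanged, which is the knob that would be tuned later.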

Should I just create a custom tokenizer and analyzer for that, and run a query against a "merchants" index using the tx note as the search term? What would be a good pipeline structure for single-language tx/merchant matching?
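A rough sketch of what that first option might look like, written as plain request bodies so nothing here depends on a particular client version. The index layout, the analyzer name `merchant_text`, the filter name `noise_stop`, and the noise-word list are all assumptions for illustration. The built-in `letter` tokenizer splits on non-letter characters, so "AZAMON.COM" becomes "azamon" and "com" after lowercasing, and the `stop` filter then drops the words listed as noise.

```python
from typing import Any, Dict

# Hypothetical noise words; a real list would come from inspecting actual notes.
NOISE_WORDS = ["card", "payment", "to", "eur", "gbp", "rate", "on", "com", "ltd", "inc"]

# Body for creating a hypothetical "merchants" index with a custom analyzer.
MERCHANTS_INDEX_BODY: Dict[str, Any] = {
    "settings": {
        "analysis": {
            "filter": {
                "noise_stop": {"type": "stop", "stopwords": NOISE_WORDS}
            },
            "analyzer": {
                "merchant_text": {
                    "type": "custom",
                    "tokenizer": "letter",  # split on anything that is not a letter
                    "filter": ["lowercase", "noise_stop"],
                }
            },
        }
    },
    "mappings": {
        "properties": {
            # The same analyzer is applied to the query text by a match query.
            "name": {"type": "text", "analyzer": "merchant_text"}
        }
    },
}


def build_merchant_query(note: str, min_score: float) -> Dict[str, Any]:
    """Build a search body that uses the whole tx note as the search term."""
    return {
        "size": 1,               # only the best candidate is needed
        "min_score": min_score,  # drop hits scoring below the tunable threshold
        "query": {"match": {"name": {"query": note}}},
    }


query_body = build_merchant_query(
    "CARD PAYMENT TO AZAMON.COM 100.0 EUR, RATE 0.86/GBP ON 05-05-2022",
    min_score=5.0,
)
```

The first dict would be sent to the create-index API and the second to the search API on the merchants index. Whether a stop-word list like this is enough, or whether fuzziness or n-grams are needed for misspelled names, is exactly the open part of the question.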

Or is there a more efficient out-of-the-box solution for this problem? I'm reading about NER, document similarity and other techniques, but I can't figure out which approach is best in my (simple) case.

Pointing me to relevant and proven doc pages will be considered an acceptable answer. TY

system · June 9, 2022, 2:36pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
