Hello Elasticsearch Sorcerers,
I am running Logstash v7.4.0 and Elasticsearch v7.4.0 in a nicely-working pipeline. My chain of fruit stores are sending my sales information to Logstash; Logstash then pushes that data to Elasticsearch. Here’s a subsample of my Elasticsearch fruit_sales index, plus a few example data records:
product_code qnty ==================== 100 2039 200 382 300 1028 400 128
This means: I sold 2,039 units of product 100, 382 units of product 200, and so on.
My setup works great, but the trouble is it’s a pain to remember what Product Code is assigned for which fruit. On a piece of paper in my office, I have a list:
product_code name ====================== 100 apples 200 oranges 300 bananas
The list currently has several hundred items. It does not change much, but I might need to modify it from time to time. Unfortunately, there is no way to modify the data that my stores send into Logstash to include these names.
What I’d love is a solution where I can input my product_code-to-name list into ElasticSearch. And then, when ES is processing the data from Logstash, ES adds a new string field into each data record:
product_code *name* qnty ================================ 100 “apples” 2039 200 “oranges” 382 300 “bananas” 1028 400 “other” 128
Now in my index, I have both the numerical product code plus a human-readable string with the fruits’ name. This makes checking queries and reading Kibana Visualizations much, much easier. Also note that there is a default value (“
other”) in case I have a product which appears in the sales data, but not on the name list.
Another consideration: I don’t think I’ll need to update the product_code-to-name list much, but I can’t have a solution where I would need to stop and restart the ES service whenever I make an edit to that list. Of course, if I have 10,000 data records with a Name field already populated, and then, say, I change Product_Code 100 from “apples” to “papayas,” there’s no need to traverse through all existing records to change that. But every new data record from that point out needs to have the new string assignment.
Is there a graceful, non-computational way to do this? In SQL, you could do a join between two tables, in C you could create a product_code-to-name hash table, etc.
In ES, I assume I would use the “
put mapping” API to create the “name” field in my index, and I could create it as a string plus set all initial values to “
other” to enforce the default value. But I’m baffled how I could do the product_code-to-name lookup…?
Any advice on this will be wildly appreciated.
FULL DISCLOSURE: I am also posting a variation of this question to the Logstash forum, as this task might be easier on LS than ES.