Dec 14th, 2023: [EN] A Peak Inside Santa's Planning Meeting - Using ES|QL for Data Enrichment

Often, we hear of Santa’s list, but what if the Elves had a list as well? With a little imagination, we could assume Santa’s list contains the names of children and whether they were naughty or nice and the Elves list would contain the name of the toy to be delivered as well as the mode of delivery such as chimney or fire escape. Santa and his elves must come together to compare both lists to ensure only toys for kids with the Nice checkmark are built.

In the monthly planning meeting, they decide it would be more efficient to create a new list with data from both lists. This will ensure the Elves know which child should receive a toy and Santa knows which toy should be delivered along with the mode of delivery.

One of the Elves suggests using ES|QL to enrich the data by adding data from both lists for only the children who were “Nice” this year.

ES|QL is a piped query language that allows output from one command to be passed as input into another command. This makes it great for filtering, searching, transforming, and aggregating your data while still maintaining simplicity.

Santa’s list could look something like this:

     address      |  first_name   |      id       |   last_name   |    status     
------------------+---------------+---------------+---------------+---------------
123 Perry Ave     |Maya           |12045          |Nichols        |Nice           
256 Jefferson Pike|Jacob          |12046          |Meyers         |Naughty        
246 Washhington St|Tonya          |12047          |Speed          |Nice           
357 Pinecone Rd   |Peter          |12048          |Marshall       |Nice           
123 Perry Ave     |Mia            |12049          |Nichols        |Naughty      

And the Elves list could like this:

delivery_method|      id       |   toy_name    
---------------+---------------+---------------
Chimney        |12045          |bike           
Fire escape    |12046          |Ninja Turtles  
Chimney        |12047          |Easy bake oven 
Fire escape    |12048          |Magician kit   
Chimney        |12049          |Prankster Kit  

To do this, we only need to create an enrichment policy.

Index documents

First, let’s create the mapping for Santa’s list:

PUT santas_list
{
  "mappings": {
    "properties": {
      "id": {
        "type": "keyword"
      },
      "first_name": {
        "type": "keyword"
      },
      "last_name": {
        "type": "keyword"
      },
      "address": {
        "type": "keyword"
      },
      "status": {
        "type": "keyword"
      }
    }
  }
}

It’s important to note, the ES|QL enrich command only supports enrich policies of type match. This means the request must be an exact match, taking into account case sensitivity. Therefore enrich only supports enriching on a column of type keyword.

Something else to keep in mind is the naming of the id field. It is tempting to use something like _id as the name, but this will result in the field not being returned as expected. More specifically, _id field is restricted from use in aggregations, sorting and scripting. The advice here is to duplicate the content of the _id field into another field that has doc_values.

Now that we have defined our index mapping, let’s add some data:

PUT santas_list/_bulk
{"index": {}}
{"id": 12045, "first_name": "Maya", "last_name": "Nichols", "address": "123 Perry Ave", "status": "Nice"}
{"index": {}}
{"id": 12046, "first_name": "Jacob", "last_name": "Meyers", "address": "256 Jefferson Pike", "status": "Naughty"}
{"index": {}}
{"id": 12047, "first_name": "Tonya", "last_name": "Speed", "address": "246 Washhington St", "status": "Nice"}
{"index": {}}
{"id": 12048, "first_name": "Peter", "last_name": "Marshall", "address": "357 Pinecone Rd", "status": "Nice"}
{"index": {}}
{"id": 12049, "first_name": "Mia", "last_name": "Nichols", "address": "123 Perry Ave", "status": "Naughty"}

Now, let’s create the mapping for the Elves list:

PUT elves_toy_backlog
{
  "mappings": {
    "properties": {
      "id": {
        "type": "keyword"
      },
      "toy_name": {
        "type": "keyword"
      },
      "delivery_method": {
        "type": "keyword"
      }
    }
  }
}

Now, we can add the data:

PUT elves_toy_backlog/_bulk
{"index": {}}
{"id": 12045, "toy_name": "bike", "delivery_method": "Chimney"}
{"index": {}}
{"id": 12046, "toy_name": "Ninja Turtles", "delivery_method": "Fire escape"}
{"index": {}}
{"id": 12047, "toy_name": "Easy bake oven", "delivery_method": "Chimney"}
{"index": {}}
{"id": 12048, "toy_name": "Magician kit", "delivery_method": "Fire escape"}
{"index": {}}
{"id": 12049, "toy_name": "Prankster Kit", "delivery_method": "Chimney"}

Create enrich policy

For the enrich policy, we want to use the id field as our match_field to merge the two lists. For our enriched fields, we will be adding toy_name and delivery_method.

PUT /_enrich/policy/christmas-planning-meeting-policy
{
  "match": {
    "indices": "elves_toy_backlog",
    "match_field": "id",
    "enrich_fields": ["toy_name", "delivery_method"]
  }
}

Next, we need to execute the policy:

PUT /_enrich/policy/christmas-planning-meeting-policy/_execute?wait_for_completion=false

Execute ES|QL query

Finally, we can run the query:

POST /_query?format=txt
{
  "query": """
FROM santas_list
| WHERE status LIKE "Nice"
| KEEP first_name, last_name, id, status
| EVAL id
| ENRICH christmas-planning-meeting-policy ON id WITH toy_name, delivery_method
  """
}

Here, we select only the children who have a status of “Nice”. We then add the first_name, last_name, id, and status field from Santa’s list. Then, we’re able to add a new id column and use the enrich policy to add the toy_name and delivery method.

The final result should look like this:

  first_name   |   last_name   |    status     |      id       |   toy_name    |delivery_method
---------------+---------------+---------------+---------------+---------------+---------------
Maya           |Nichols        |Nice           |12045          |bike           |Chimney        
Tonya          |Speed          |Nice           |12047          |Easy bake oven |Chimney        
Peter          |Marshall       |Nice           |12048          |Magician kit   |Fire escape 

Updating/Changing Enrich Policy

Unfortunately, you can’t update or change an enrich policy. Instead, you can:

  1. Create and execute a new enrich policy.
  2. Replace the previous enrich policy with the new enrich policy in any in-use enrich processors or ES|QL queries.
  3. Use the delete enrich policy API or Index Management in Kibana to delete the previous enrich policy.

Conclusion

In this blog, we created an enrich policy to combine data from two indices. We then, implemented the policy within an ES|QL request that gave our final table.

Now, the Elves can get started building the toys for all the children on Santa's list marked as "Nice" and Santa knows which toys to deliver to each child.

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.