Reference Tables in Elastic

Scenario:

  • I have a multi-field index that stores documents, 'index a', which has a field "id" whose value is a hexadecimal number.
  • I have another index, 'b', which stores a list of ids and their corresponding string names.
  • I would like to use index b as a reference table for index a, so that instead of displaying the hex digits I can display the matched string in Kibana.

Is this possible using Elastic? It resembles how an inner join/where query would work in SQL.

It is possible, but since Elasticsearch does not support JOINs, you would need to add the information from index b to the documents in index a.

This could be done when ingesting the data.

There are a couple of ways to do that; which one fits depends on how you are sending your data to Elasticsearch.

If you are sending it directly to Elasticsearch, you can use an ingest pipeline with an enrich processor to enrich your data.
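
As a rough sketch of what the enrich side could look like for this scenario: the body below would define a match enrich policy over index b, sent with PUT /_enrich/policy/id-names and then built with POST /_enrich/policy/id-names/_execute. The policy name id-names and the field names "id" and "name" in index b are assumptions for illustration; substitute whatever your index b actually uses.

```json
{
    "match": {
        "indices": "b",
        "match_field": "id",
        "enrich_fields": ["name"]
    }
}
```

An enrich processor that references this policy looks up the incoming document's "id" in the enrich index and copies the matching entry into the target field. The enrich index is a snapshot of index b, so the policy has to be executed again whenever index b changes.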

If you are using Logstash, you can use a couple of filters to do that.

Would this work if we are ingesting the data using the bulk API?

Yes.

Using the bulk API means that you are sending your logs by making requests directly to Elasticsearch, so you need to create an ingest pipeline with the enrich processor.

Then you have two options: pass the ingest pipeline directly on the request with the pipeline query parameter (for example, POST /_bulk?pipeline=pipeline-name), or add the ingest pipeline to your template as the index.default_pipeline or index.final_pipeline setting.

Okay, thank you. I will look into this. I appreciate your help.

Hey @leandrojmp! I got the pipeline working for one of my policies and I have a few follow-up questions:

  • Is it possible to include more than one enrich policy in a single pipeline, and if so, what is the syntax?
  • How/where can I include this pipeline in the template as you mentioned? I am unsure of the syntax.

I have searched for examples on these but have not found anything useful. Any examples would be greatly appreciated.

You can have as many enrich processors as you want in your pipeline. The example in the documentation shows how to add an enrich processor to an ingest pipeline.

To add another enrich, you just add another processor object to the processors array. This can be done with an API request or through the Kibana interface, which is easier.

PUT /_ingest/pipeline/ingest-pipeline-name
{
    "processors": [
        {
            "enrich": {
                "description": "description one",
                "policy_name": "enrich_policy_1",
                "field": "source_field_1",
                "target_field": "target_field_1"
            }
        },
        {
            "enrich": {
                "description": "description two",
                "policy_name": "enrich_policy_2",
                "field": "source_field_2",
                "target_field": "target_field_2"
            }
        }
    ]
}
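
Before wiring the pipeline into your bulk requests, it may help to try it against a sample document. A minimal sketch, assuming the pipeline name and source fields from the example above: send the body below with POST /_ingest/pipeline/ingest-pipeline-name/_simulate. The two field values are made-up placeholders, so replace them with ids that actually exist in your enrich indices.

```json
{
    "docs": [
        {
            "_source": {
                "source_field_1": "0x1A",
                "source_field_2": "0x2B"
            }
        }
    ]
}
```

If both policies find a match, the simulated document in the response should contain target_field_1 and target_field_2 alongside the original fields. If a target field is missing, that lookup most likely found no match for the value you sent.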

To specify a default_pipeline or a final_pipeline in the template, you put it in the settings section, the same place where you have number_of_shards or number_of_replicas.

For example, considering the settings part of a component template, to set the default_pipeline you will need something like this:

{
    "template": {
        "settings" : {
            "index" : {
              "refresh_interval" : "30s",
              "number_of_shards" : "1",
              "number_of_replicas" : "1",
              "default_pipeline": "pipeline-name"
            }
        }
    }
}
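
Keep in mind that template settings are only applied when an index is created. For an index that already exists, a possible alternative is to update the setting on the index itself. A minimal sketch, assuming your existing index is named a as in the original question: send the body below with PUT /a/_settings.

```json
{
    "index": {
        "default_pipeline": "pipeline-name"
    }
}
```

After this, documents indexed into a without an explicit pipeline parameter should go through pipeline-name. Documents already in the index are not changed, so you would need to reindex them if they should be enriched as well.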

@leandrojmp thank you so much!
