[ingest node] How to index data taken from another index

tsgkdt · August 19, 2020, 1:22pm

With an ingest node, the Enrich Processor is the feature that does this.

Overview:
https://www.elastic.co/guide/en/elasticsearch/reference/7.9/ingest-enriching-data.html

Reference for filling in data from matching records, which is close to your case:
https://www.elastic.co/guide/en/elasticsearch/reference/7.9/match-enrich-policy-type.html

The match-enrich-policy-type.html page should give you the clearer picture of the two.

The steps I used to verify this on my side are listed below.

Preparing the lookup data

POST index_c/_doc/a
{
  "a_id": "1111",
  "b_id": "aaaa"
}

POST index_b/_doc/a
{
  "b_id": "aaaa",
  "b_comment": "test"
}

Creating the enrich policies

# For index_c: match on a_id and add b_id
PUT /_enrich/policy/index-c-policy
{
  "match": {
    "indices": "index_c",
    "match_field": "a_id",
    "enrich_fields": ["b_id"]
  }
}

# For index_b: match on b_id and add b_comment
PUT /_enrich/policy/index-b-policy
{
  "match": {
    "indices": "index_b",
    "match_field": "b_id",
    "enrich_fields": ["b_comment"]
  }
}

Executing the policies

The Enrich Processor does not read index_b or index_c directly, so each policy has to be executed to create its enrich index.

POST /_enrich/policy/index-c-policy/_execute
POST /_enrich/policy/index-b-policy/_execute
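
To make the point about the enrich index concrete, here is a minimal sketch in plain Python. It is a toy model, not Elasticsearch client code: `source_indices` and `execute_policy` are hypothetical names I made up for illustration. It assumes that executing a policy copies the match field and the enrich fields into a separate lookup structure, so later changes to the source index are not visible until the policy is executed again.

```python
import copy

# Toy stand-ins for the source indices (not real Elasticsearch objects).
source_indices = {
    "index_c": [{"a_id": "1111", "b_id": "aaaa"}],
    "index_b": [{"b_id": "aaaa", "b_comment": "test"}],
}


def execute_policy(indices, index_name, match_field, enrich_fields):
    """Build a lookup table keyed by match_field.

    Only the match field and the enrich fields are copied, and the
    result is a snapshot that is independent of the source documents.
    """
    snapshot = {}
    for doc in indices[index_name]:
        if match_field not in doc:
            continue
        kept = {f: doc[f] for f in [match_field, *enrich_fields] if f in doc}
        snapshot.setdefault(doc[match_field], []).append(copy.deepcopy(kept))
    return snapshot


# Roughly what "POST /_enrich/policy/index-c-policy/_execute" is modeled as.
enrich_c = execute_policy(source_indices, "index_c", "a_id", ["b_id"])

# Changing the source afterwards does not affect the snapshot ...
source_indices["index_c"][0]["b_id"] = "bbbb"
before = enrich_c["1111"][0]["b_id"]

# ... until the policy is executed again.
enrich_c = execute_policy(source_indices, "index_c", "a_id", ["b_id"])
after = enrich_c["1111"][0]["b_id"]

print(before, after)
```

In practice this means that when index_b or index_c changes, the corresponding `_execute` request needs to be run again before the pipeline picks up the new values.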

Creating the ingest pipeline

The rename and remove processors are there to flatten the nested structure that the enrich results produce.

PUT /_ingest/pipeline/test
{
  "description": "forum test",
  "processors": [
    {
      "enrich": {
        "policy_name": "index-c-policy",
        "field": "a_id",
        "target_field": "b",
        "max_matches": 1
      }
    },
    {
      "enrich": {
        "policy_name": "index-b-policy",
        "field": "b.b_id",
        "target_field": "b",
        "max_matches": 1
      }
    },
    {
      "rename": {
        "field": "b.b_comment",
        "target_field": "b_comment"
      }
    },
    {
      "rename": {
        "field": "b.b_id",
        "target_field": "b_id"
      }
    },
    {
      "remove": {
        "field": "b"
      }
    }
  ]
}
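
To show why the rename and remove steps are needed, here is a small trace of the document as it passes through the five processors. This is a plain-Python approximation, not how Elasticsearch implements it: `ENRICH_C`, `ENRICH_B`, and the helper functions are hypothetical, and the sketch assumes a single match, where the matched entry (match field plus enrich fields) is written as an object under `target_field` and overwrites whatever was there.

```python
import copy

# Toy lookup tables standing in for the two enrich indices.
ENRICH_C = {"1111": {"a_id": "1111", "b_id": "aaaa"}}
ENRICH_B = {"aaaa": {"b_id": "aaaa", "b_comment": "test"}}


def get_path(doc, path):
    """Read a dotted path such as "b.b_id"; return None if absent."""
    current = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def enrich(doc, table, field, target_field):
    """Copy the matching lookup entry under target_field (single match)."""
    match = table.get(get_path(doc, field))
    if match is not None:
        doc[target_field] = dict(match)


def rename(doc, field, target_field):
    """Move a value nested one level deep to a top-level field."""
    parent, child = field.split(".")
    doc[target_field] = doc[parent].pop(child)


def run_pipeline(doc):
    """Apply the five steps in order and record the intermediate states."""
    doc = copy.deepcopy(doc)
    trace = []
    enrich(doc, ENRICH_C, "a_id", "b")       # b now holds a_id and b_id
    trace.append(copy.deepcopy(doc))
    enrich(doc, ENRICH_B, "b.b_id", "b")     # b is replaced by b_id and b_comment
    trace.append(copy.deepcopy(doc))
    rename(doc, "b.b_comment", "b_comment")  # lift b_comment to the top level
    rename(doc, "b.b_id", "b_id")            # lift b_id to the top level
    del doc["b"]                             # drop the now-empty wrapper
    trace.append(doc)
    return trace


trace = run_pipeline({"a_id": "1111"})
for state in trace:
    print(state)
```

The second state is the nested form that the two enrich steps leave behind; the last state is the flat document the pipeline is meant to produce. Note that this toy `rename` raises `KeyError` when nothing matched, so a document with an unknown a_id would need separate handling.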

Indexing a document

PUT /my-index-00001/_doc/1?pipeline=test
{
  "a_id": "1111"
}

Checking the result

GET /my-index-00001/_doc/1

{
  "_index" : "my-index-00001",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 0,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "a_id" : "1111",
    "b_id" : "aaaa",
    "b_comment" : "test"
  }
}
