Combining three different indices into one that can be queried over ODBC with Tableau

Hello,

I have three different indices called: first, second, third.
first is the main index; second references first, and third references second, using id fields.
The sample data looks like this:

first - 2 sample documents (two different ids)

{"first.id": 1,
"first.desc": "first_1"}

{"first.id": 2,
"first.desc": "first_2"}

second - 2 sample documents (referencing two different first.id values)

{"second.id": 1,
"second.desc": "second_1",
"first.id": 1}

{"second.id": 2,
"second.desc": "second_2",
"first.id": 2}

third - 2 sample documents (referring to the same second.id document)

{"third.id": 1,
"second.desc": "third_1",
"second.id": 1}

{"third.id": 2,
"second.desc": "third_2",
"second.id": 1}

I would like to automatically create an index that contains the documents from all three indices and is updated whenever a new document is indexed in any of them. A sample of such an index would contain the records below:

combined index - would contain 3 documents in total.

{"first.id": 1,
"first.desc": "first_1",
"second.id": 1,
"second.desc": "second_1",
"third.id": 1,
"second.desc": "third_1"}

{"first.id": 1,
"first.desc": "first_1",
"second.id": 1,
"second.desc": "second_1",
"third.id": 2,
"second.desc": "third_2"}

{"first.id": 2,
"first.desc": "first_2",
"second.id": 2,
"second.desc": "second_2"}

This is to make sure our data is denormalized so that Tableau can pull it over ODBC.
Is that possible using an existing mechanism?
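For reference, the desired result is essentially a left join of first → second → third on the id fields. A minimal Python sketch (not part of the original setup, purely to illustrate the expected fan-out from the sample data above):

```python
# Sketch: simulating the desired denormalized "combined" index with a
# client-side join over the three sample data sets.
first = [
    {"first.id": 1, "first.desc": "first_1"},
    {"first.id": 2, "first.desc": "first_2"},
]
second = [
    {"second.id": 1, "second.desc": "second_1", "first.id": 1},
    {"second.id": 2, "second.desc": "second_2", "first.id": 2},
]
third = [
    {"third.id": 1, "third.desc": "third_1", "second.id": 1},
    {"third.id": 2, "third.desc": "third_2", "second.id": 1},
]

def combine(first, second, third):
    """Left-join first -> second -> third on their id fields."""
    docs = []
    for f in first:
        for s in (s for s in second if s["first.id"] == f["first.id"]):
            matches = [t for t in third if t["second.id"] == s["second.id"]]
            if not matches:
                docs.append({**f, **s})       # no matching third row: keep the pair
            for t in matches:
                docs.append({**f, **s, **t})  # fan out once per matching third row
    return docs

combined = combine(first, second, third)
print(len(combined))  # 3 documents, matching the sample combined index above
```

The two third rows that share second.id 1 each produce their own combined document, while first_2/second_2 yields a single document with no third fields.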

You might be able to do that with https://www.elastic.co/guide/en/elasticsearch/reference/current/transform-apis.html

Thank you @warkolm,

I've looked at the link you sent and came up with a simple transform using this data:

PUT first
{"mappings":{"properties":{"first_id":{"type":"short"},"first_desc":{"type":"keyword"}}}}

PUT second
{"mappings":{"properties":{"second_id":{"type":"short"},"second_desc":{"type":"keyword"},"first_id":{"type":"short"}}}}

PUT third
{"mappings":{"properties":{"third_id":{"type":"short"},"third_desc":{"type":"keyword"},"second_id":{"type":"short"}}}}

PUT combined_index
{"mappings":{"properties":{"first_id":{"type":"short"},"first_desc":{"type":"keyword"},"second_id":{"type":"short"},"second_desc":{"type":"keyword"},"third_id":{"type":"short"},"third_desc":{"type":"keyword"}}}}

Sample data:

POST _bulk
{"index": {"_index": "first", "_id": "1"}}
{"first_id": 1, "first_desc": "first_1"}
{"index": {"_index": "first", "_id": "2"}}
{"first_id": 2, "first_desc": "first_2"}
{"index": {"_index": "second", "_id": "1"}}
{"second_id": 1, "second_desc": "second_1", "first_id": 1}
{"index": {"_index": "second", "_id": "2"}}
{"second_id": 2, "second_desc": "second_2", "first_id": 2}
{"index": {"_index": "third", "_id": "1"}}
{"third_id": 1, "third_desc": "third_1", "second_id": 1}
{"index": {"_index": "third", "_id": "2"}}
{"third_id": 2, "third_desc": "third_2", "second_id": 1} 

I created this basic transform:

PUT _transform/test_combined
{
  "source": {
    "index": ["first", "second", "third"]
  },
  "pivot": {
    "group_by": {
      "test": {
        "terms": {
          "field": "first_id"
        }
      }
    },
    "aggregations": {
      "min_id": {
        "min": {
          "field": "first_id"
        }
      }
    }
  },
  "description": "Transform test",
  "dest": {
    "index": "combined_index"
  }
}

(There is an error in this transform related to my fields, but I'll fix that by expanding the index with additional fields.)
Here is the result of GET combined_index/_search

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "combined_index",
        "_type" : "_doc",
        "_id" : "AGvuZWuqqz7c5ytICzX5Z74AAAAAAAAA",
        "_score" : 1.0,
        "_source" : {
          "min_id" : 1.0,
          "test" : 1
        }
      },
      {
        "_index" : "combined_index",
        "_type" : "_doc",
        "_id" : "AA3tqz9zEwuio1D73_EArycAAAAAAAAA",
        "_score" : 1.0,
        "_source" : {
          "min_id" : 2.0,
          "test" : 2
        }
      }
    ]
  }
}
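A quick way to sanity-check this result: the pivot groups every source document by first_id and computes the min aggregation per bucket; documents without a first_id (everything in the third index) simply drop out of a terms group_by, which is why only two documents appear. A rough Python approximation over the bulk data above:

```python
# Sketch: approximating what the pivot transform computes over the bulk data.
docs = [
    {"first_id": 1, "first_desc": "first_1"},
    {"first_id": 2, "first_desc": "first_2"},
    {"second_id": 1, "second_desc": "second_1", "first_id": 1},
    {"second_id": 2, "second_desc": "second_2", "first_id": 2},
    {"third_id": 1, "third_desc": "third_1", "second_id": 1},
    {"third_id": 2, "third_desc": "third_2", "second_id": 1},
]

buckets = {}
for d in docs:
    key = d.get("first_id")
    if key is None:  # a terms group_by skips documents missing the field
        continue
    buckets.setdefault(key, []).append(d)

# One output document per bucket, with the min aggregation applied.
result = [{"test": k, "min_id": min(d["first_id"] for d in v)}
          for k, v in sorted(buckets.items())]
print(result)  # two buckets, matching the two hits in the search response
```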

Now I'm trying to figure out how exactly to match values from the other indices based on the id fields. I haven't found anything so far, except for this code from an ML webinar: https://gist.github.com/stevedodson/c80d245a5dc6ae8cc93e9f9d25897aef
But understanding how exactly "Transform raw data into an entity-centric index" works is difficult to grasp without some baseline.
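One limitation worth keeping in mind (my understanding, not stated in the linked material): a pivot emits exactly one output document per unique group_by key, so grouping by first_id can never fan out into two combined documents for the two third rows that share second.id 1. A minimal Python sketch of the problem, assuming hypothetically that first_id had already been denormalized onto the third documents:

```python
# Sketch: a pure per-key merge keeps one document per first_id, so the
# second "third" row (third_id 2) overwrites the first instead of
# producing a separate combined document.
docs = [
    {"first_id": 1, "first_desc": "first_1"},
    {"second_id": 1, "second_desc": "second_1", "first_id": 1},
    # hypothetical: first_id copied onto the third documents
    {"third_id": 1, "third_desc": "third_1", "second_id": 1, "first_id": 1},
    {"third_id": 2, "third_desc": "third_2", "second_id": 1, "first_id": 1},
]

merged = {}
for d in docs:
    merged.setdefault(d["first_id"], {}).update(d)

print(merged[1]["third_id"])  # 2 -- the first third row's value was overwritten
```

Producing all three desired combined documents therefore needs some form of lookup or join (for example, done client-side as a denormalization step) rather than a plain pivot on one key.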
