Many-to-one parent-child relationship


(kumar.soumitra) #1

I have an index that stores metrics of applications during their lifecycle. For a given application (say, application '1') there are 0 or more documents written while it is running and exactly 1 document written once it has finished. I am using a parent-child relationship to store the tags of each application.

According to https://www.elastic.co/guide/en/elasticsearch/guide/current/parent-child.html , parent-child is a one-to-many relationship. Is there a way to make a many-to-one relationship work using the routing and parent fields?

My attempt is below, but it does not work. I am using elasticsearch-5.5.2.

Let's say there are N documents for application id '1', i.e. N-1 documents in state RUNNING and 1 in state FINISHED, and there are M tags associated with this application (appid '1'). I wish to index N documents in type 'apps' and M documents in type 'tags', and use a has_child query to filter both RUNNING and FINISHED applications by tag.

Here is my template definition:

$ cat template.app.json
{
  "template": "apps-*",
  "mappings": {
    "_default_": {
      "dynamic": "strict",
      "date_detection": false,
      "_routing": { "required": true },
      "_all": { "enabled": false },
      "_source": { "enabled": true },
      "properties": {
        "appid": { "type": "string", "index": "not_analyzed" },
        "state": { "type": "string", "index": "not_analyzed" },
        "progress": { "type": "float", "index": "no" },
        "memory": { "type": "long", "index": "no" },
        "date": { "type": "date" },
        "key": { "index": "not_analyzed", "type": "string" },
        "value": { "index": "not_analyzed", "type": "string" },
        "comment": { "type": "string", "store": false }
      }
    },
    "apps": {},
    "tags": {
      "_parent": { "type": "apps" }
    }
  }
}

I am using the _routing and _parent fields during indexing to make sure that all 'tags' and 'apps' documents for a particular application are indexed in the same shard.

PUT app_index/apps/1?routing=1
PUT app_index/apps/1-r1?routing=1
PUT app_index/apps/1-r2?routing=1

PUT app_index/tags/1t1?routing=1&parent=1
PUT app_index/tags/1t2?routing=1&parent=1
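To make the indexing scheme explicit, here is a minimal Python sketch that only generates the request lines shown above for one application. The function name `index_requests` and its parameters are hypothetical, and it assumes the document whose id equals the application id is the FINISHED one. It builds strings and does not talk to Elasticsearch.

```python
def index_requests(index, appid, running_docs, tag_count):
    """Return the PUT request lines for one application.

    Every document uses the application id as its routing value, and every
    tag names the document whose id is the application id as its parent
    (assumed here to be the FINISHED document).
    """
    requests = [f"PUT {index}/apps/{appid}?routing={appid}"]
    # RUNNING documents follow the "<appid>-r<n>" id scheme from the post.
    for i in range(1, running_docs + 1):
        requests.append(f"PUT {index}/apps/{appid}-r{i}?routing={appid}")
    # Tag documents follow the "<appid>t<n>" id scheme from the post.
    for j in range(1, tag_count + 1):
        requests.append(
            f"PUT {index}/tags/{appid}t{j}?routing={appid}&parent={appid}"
        )
    return requests


if __name__ == "__main__":
    for line in index_requests("app_index", "1", running_docs=2, tag_count=2):
        print(line)
```

Note that every tag carries a single parent id, so each tag is attached to exactly one 'apps' document even though all documents share the same routing value.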

  1. Get all the FINISHED applications that have the "user" tag. This works as expected.
    GET /_search
    {
      "query": {
        "bool": {
          "must": [
            { "term": { "state": "FINISHED" } },
            { "has_child": { "type": "tags", "query": { "bool": { "must": [ { "term": { "key": "user" } } ] } } } }
          ]
        }
      }
    }

  2. Get all the RUNNING applications that have the "user" tag. This does not work, because the parent id of every tag points to the FINISHED document, so the RUNNING documents have no children.
    GET /_search
    {
      "query": {
        "bool": {
          "must": [
            { "term": { "state": "RUNNING" } },
            { "has_child": { "type": "tags", "query": { "bool": { "must": [ { "term": { "key": "user" } } ] } } } }
          ]
        }
      }
    }
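The difference between the two queries can be reproduced with a toy in-memory model in Python. This is only a sketch of the matching logic as I understand it, not Elasticsearch code: the document states, the second tag key "queue", and the helper `apps_with_tag` are assumptions made up for illustration.

```python
# Toy model of the documents indexed above (states are assumed).
apps = [
    {"_id": "1", "state": "FINISHED"},
    {"_id": "1-r1", "state": "RUNNING"},
    {"_id": "1-r2", "state": "RUNNING"},
]
# Each tag points at exactly one parent id; "queue" is a made-up key.
tags = [
    {"_id": "1t1", "_parent": "1", "key": "user"},
    {"_id": "1t2", "_parent": "1", "key": "queue"},
]


def apps_with_tag(state, key):
    """Mimic a `term` filter on state combined with `has_child` on tags.key."""
    # Ids of parent documents that own at least one matching tag.
    parents_with_tag = {t["_parent"] for t in tags if t["key"] == key}
    return [
        a["_id"]
        for a in apps
        if a["state"] == state and a["_id"] in parents_with_tag
    ]


print(apps_with_tag("FINISHED", "user"))  # ['1']
print(apps_with_tag("RUNNING", "user"))   # []
```

Sharing a routing value only places the documents on the same shard; in this model the match still depends on the parent id, which is why the RUNNING documents never qualify.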

I don't want to duplicate the tags for every RUNNING document. Is there a way to make this work with a has_child query, or with some other kind of query (scripted or pipeline)? I would greatly appreciate any help here.

