Index parent-child relation using Logstash (one-to-many): configuration issue


(Samta Malhotra) #1

I am using a parent-child relation between two entities. They are fetched from a database, where they come from two different tables, so I use an SQL join statement in the Logstash config file. Details of the config file:

input {
  jdbc {
    jdbc_connection_string => "url"
    jdbc_user => "user"
    jdbc_password => "pswd"
    jdbc_driver_library => "./ifxjdbc-3-50-JC7.jar"
    jdbc_driver_class => "com.informix.jdbc.IfxDriver"
    statement => "SELECT st1.name as s_name, st1.pnumber, st1.mnumber, st2.name as comp_name, zen.s_id, zen.comp_id, zen.conc_1, zen.conc_2 FROM sub_zen zen join sub st1 on st1.id = zen.s_id join sub st2 on st2.id = zen.comp_id"
  }
}

filter {
  mutate { remove_field => ["@timestamp"] }
}

output {
  stdout { codec => json_lines }
  elasticsearch {
    manage_template => false
    hosts => "url"
    index => "test-migrate"
    parent => "%{s_id}"
  }
}

So, I join table sub (aliased as st1 and st2) to table sub_zen on "id", the primary key of sub, first via "s_id" and then via "comp_id". An id from sub can therefore appear as either s_id or comp_id in sub_zen.

I define the mapping in ES with dynamic strict mapping and set manage_template to false in Logstash. Now, I would like to index the data into ES with two different types:

Sub with fields: s_id,s_name,mnumber and pnumber
Comp with fields: comp_id, comp_name, conc_1,conc_2

I have defined my mapping in ES already:

curl -XGET url:9200/test-migrate/_mapping?pretty

{
  "test-migrate" : {
    "mappings" : {
      "sub" : {
        "dynamic" : "strict",
        "properties" : {
          "mnumber" : { "type" : "long" },
          "pnumber" : { "type" : "long" },
          "s_id" : { "type" : "long" },
          "s_name" : {
            "type" : "text",
            "fields" : {
              "keyword" : { "type" : "keyword", "ignore_above" : 256 }
            }
          }
        }
      },
      "comp" : {
        "dynamic" : "strict",
        "_parent" : { "type" : "sub" },
        "_routing" : { "required" : true },
        "properties" : {
          "comp_id" : { "type" : "long" },
          "comp_name" : {
            "type" : "text",
            "fields" : {
              "keyword" : { "type" : "keyword", "ignore_above" : 256 }
            }
          },
          "conc_1" : { "type" : "float" },
          "conc_2" : { "type" : "float" }
        }
      }
    }
  }
}

The current Logstash config file does not index the data the way I would like. What should I change? How can I tell Logstash to index the fields (s_id, s_name, pnumber, mnumber) into sub and (comp_id, comp_name, conc_1, conc_2) into comp, while respecting the parent-child relation? Here sub is a parent which can have many comp children. Any pointers?


(Fabien Baligand) #2

Hi,

Have you tried the aggregate plugin, using this sample use case?
https://www.elastic.co/guide/en/logstash/current/plugins-filters-aggregate.html#plugins-filters-aggregate-example4

Then, could you give an example of the target document (or documents) you would like in Elasticsearch, for one zen related to 1-n st1 and 1-m st2 rows?
Please format it as JSON or in rubydebug format.
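For reference, the grouping that example 4 performs (many JDBC rows that share a task id merged into one aggregated event) can be sketched outside Logstash with plain Ruby hashes. The rows and field names below are hypothetical stand-ins for this thread's query results, not the plugin's actual internals:

```ruby
# Hypothetical joined rows, as the jdbc input would emit them (one event per row).
rows = [
  { 's_id' => 22, 's_name' => 'Eb', 'comp_id' => 129, 'comp_name' => 'abc',  'conc_1' => 1.0,  'conc_2' => 2.0 },
  { 's_id' => 22, 's_name' => 'Eb', 'comp_id' => 11,  'comp_name' => 'abc1', 'conc_1' => 15.0, 'conc_2' => 1.0 }
]

docs = {}
rows.each do |row|
  # One aggregated document per parent id, like `task_id => "%{s_id}"` in the filter.
  doc = docs[row['s_id']] ||= {
    's_id' => row['s_id'], 's_name' => row['s_name'], 'child' => []
  }
  # Each joined row contributes one nested child entry with only the comp fields.
  doc['child'] << row.select { |k, _| %w[comp_id comp_name conc_1 conc_2].include?(k) }
end

docs.values  # one document per parent, children nested in an array
```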


(Samta Malhotra) #3

Hi,

I already solved my problem yesterday (I was short on time). But thanks for your help and suggestions.


(Fabien Baligand) #4

OK, nice to see you solved your problem!
For info, did you use the aggregate plugin?


(Samta Malhotra) #5

Yeah, I did. I used a nested structure with the Elasticsearch template provided by Logstash, and took hints from your replies to other posts. :smiley: The output works and gives me results, but I am still wondering if there is a better way to organize the mapping. Say I have sub as a parent with many comp children. Right now, I have a structure like this:

[
  {
    "parent": {
      "child": [
        {
          "conc_1": 1,
          "conc_2": 2,
          "comp_id": 129,
          "comp_name": "abc"
        },
        {
          "conc_2": 1,
          "conc_1": 15,
          "comp_id": 11,
          "comp_name": "abc1"
        }
      ]
    },
    "pnumber": 200,
    "@timestamp": "abc",
    "sub_name": "Eb",
    "@version": "1",
    "mnumber": 21,
    "s_id": 22
  },
  {
    "parent": {
      "child": [
        {
          "conc_2": 6,
          "conc_1": 5,
          "comp_id": 25,
          "comp_name": "Water"
        },
        {
          "conc_2": 1,
          "conc_1": 14,
          "comp_id": 129,
          "comp_name": "cdf"
        },
        {
          "conc_2": 10,
          "conc_1": 7,
          "comp_id": 1686,
          "comp_name": "Mag"
        }
      ]
    },
    "pnumber": 207,
    "@timestamp": "2017-07-19T08:33:30.890Z",
    "s_name": "KMW",
    "@version": "1",
    "mnumber": 21,
    "s_id": 22913
  }
]

And so on....

Now, I just did a quick search using Kibana. When I search for, say, all documents with comp_id:129, I get the whole record along with the other comp_ids too, as they are in the same array. Is there a better way to search, or to organize the mapping?


(Fabien Baligand) #6

If you use the nested type for your parent/child need, that is normal: Kibana queries Elasticsearch, which returns all documents that match your query.
But for each match, the whole Elasticsearch document is returned, so you also see the other array entries in the same document.

If you want only the children that match your query, you can use a parent/child Elasticsearch mapping instead.
This implies that parents and children are separate documents, and that in your ES response you get only the child info, not the parent info.
If you want both, you have to use a has_parent query with inner_hits:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-inner-hits.html#parent-child-inner-hits
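As a sketch (the index and type names follow the mapping posted above; the comp_id value is hypothetical), such a request could filter the child type directly and pull the parent back through has_parent with inner_hits:

```json
GET test-migrate/comp/_search
{
  "query": {
    "bool": {
      "must": [
        { "term": { "comp_id": 129 } },
        {
          "has_parent": {
            "parent_type": "sub",
            "query": { "match_all": {} },
            "inner_hits": {}
          }
        }
      ]
    }
  }
}
```

Each hit is then a single matching comp document, with its sub parent returned under the inner_hits section of the response.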


(Samta Malhotra) #7

Yeah, I guessed so. I just wanted to have a working backend before I jump into the complex ES queries. I guess Kibana (Discover) is not quite there yet for refined queries.

My goal is to have parent-child-grandchild relationship in the end. Thanks for your suggestions so far. Will keep you updated!


(Fabien Baligand) #8

In Kibana, you can do has_parent queries with inner_hits.
But to see the parent data in Kibana, you have to expand a result document and click the "JSON" view to see the parent inner hit.


(Samta Malhotra) #9

OK! I need to ask you something about the aggregate plugin. Can we build a parent-child-grandchild structure? I tried it, but it does not work yet. This is what I have tried so far:

map['parent'] ||= {}
map['parent']['child'] ||= []
map['parent']['child']['grandchild'] ||= []
map['parent']['child']['grandchild'] << {'conc_1' => event.get('conc_1'), 'conc_2' => event.get('conc_2')}

(Fabien Baligand) #10

Sure, you can do this with the aggregate plugin.

The problem in your code is this line:
map['parent']['child'] ||= []

If 'child' is set to an array, then on the next line you can't set a 'grandchild' field on it as if 'child' were a hash {}.

So either you set map['parent']['child'] ||= {},
or you insert 'grandchild' as an array entry: map['parent']['child'] << {'grandchild' => []}
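Put together, the second option could look like the following; a minimal sketch with plain Ruby hashes standing in for the map and event objects that the aggregate filter provides inside its `code` block:

```ruby
# Stand-ins for the aggregate filter's map and event (inside Logstash,
# `map` persists across events and `event.get(...)` reads a field).
map   = {}
event = { 'conc_1' => 1.0, 'conc_2' => 2.0 }

map['parent'] ||= {}
map['parent']['child'] ||= []

# Each child entry is a hash, so it can carry its own 'grandchild' array.
child = { 'grandchild' => [] }
child['grandchild'] << { 'conc_1' => event['conc_1'], 'conc_2' => event['conc_2'] }
map['parent']['child'] << child
```

The key point is that 'child' stays an array while each of its entries is a hash, which is what makes the extra 'grandchild' level possible.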


(Samta Malhotra) #11

Hi again,

So, I have now extensively tested the parent-child-grandchild relationship. Works great! I want to ask something additional, and I don't want to open a new issue for it.

When we map the attributes into the array, is it possible to do it for all attributes automatically, without writing each name individually? Here is what I mean:

map['parent']['child']['grandchild'] << {'conc_1' => event.get('conc_1'),'conc_2' => event.get('conc_2')}

Instead of writing the names conc_1 and conc_2 individually, can I map all the attributes automatically?


(Fabien Baligand) #12

This should do the job:

map['parent']['child']['grandchild'] << event.to_hash
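One caveat, sketched below with a plain hash standing in for the event: to_hash copies every field on the event, including @timestamp and @version, so if those are unwanted inside the nested entry they may need to be filtered out first.

```ruby
# Stand-in for event.to_hash; a real Logstash event also carries
# @timestamp and @version alongside the data fields.
event_hash = {
  'conc_1' => 5.0, 'conc_2' => 6.0,
  '@timestamp' => '2017-07-19T08:33:30.890Z', '@version' => '1'
}

grandchild = []
# Keep only the data fields, dropping the @-prefixed bookkeeping fields.
grandchild << event_hash.reject { |k, _| k.start_with?('@') }
```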


(Samta Malhotra) #13

Thanks again. I have another issue to resolve. I posted it here: https://stackoverflow.com/questions/45750086/change-null-value-from-database-for-one-field-using-logstash-and-insert-into-ela

If you can help...


(Fabien Baligand) #14

As it is a very different issue, I invite you to open a new topic for it and notify me with @fbaligand.


(system) #15

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.