Hi everybody,
I have multiple tables in my database which I would like to import into Elasticsearch using a single document type. One of the tables has a one-to-many relation to another table.
I will use the example as provided in the guide to explain the problem, because my problem is similar (https://www.elastic.co/guide/en/elasticsearch/guide/current/nested-objects.html). So we have blogposts (in one table) and each blogpost can have multiple comments (in another table, linked with a one-to-many relation). Representing comments as Arrays of Inner Objects does not work for us, because we would loose the correlation between the different attributes of a comment as stated in the Guide. A solution is the use of Nested Objects, by mapping the comments field as type nested.
This would be a valid output to Elasticsearch:
{
{
"comments.id": [ 11 ],
"comments.name": [ john, smith ],
"comments.comment": [ article, great ],
"comments.age": [ 28 ],
"comments.stars": [ 4 ],
"comments.date": [ 2014-09-01 ]
}
{
"comments.id": [ 12 ],
"comments.name": [ alice, white ],
"comments.comment": [ like, more, please, this ],
"comments.age": [ 31 ],
"comments.stars": [ 5 ],
"comments.date": [ 2014-10-22 ]
}
{
"id": [ 1 ],
"title": [ eggs, nest ],
"body": [ making, money, work, your ],
"tags": [ cash, shares ]
}
}
I used the following nested mapping: (mostly obtained from the guide)
PUT /my_index
{
"mappings": {
"blogpost": {
"properties": {
"comments": {
"type": "nested",
"properties": {
"id": { "type": "number" },
"name": { "type": "string" },
"comment": { "type": "string" },
"age": { "type": "short" },
"stars": { "type": "short" },
"date": { "type": "date" }
}
}
}
}
}
}
The statement below shows the relevant configuration to import data from jdbc using logstash:
input {
jdbc {
...
statement => "SELECT * FROM blogpost LEFT JOIN comments ON blogpost.id = comments.id ORDER BY blogpost.id"
...
}
}
However, when looking what has been imported into ES I see that for each comment another event is generated instead of a single event per blogpost with comments as nested objects. I have not found how to implement this functionality, but here (https://github.com/jprante/elasticsearch-jdbc#structured-objects) this functionality is used. How should I proceed with importing to get a single event per blogpost with its comments as nested objects?
Thank you.
Daan