[ANN] JDBC river for Elasticsearch 2.0.0

jprante · January 17, 2013, 11:30pm

Hey,

JDBC river plugin for Elasticsearch 2.0.0 is out.

Glimpsing the features:

for Elasticsearch 0.20+
more stable, tested against Derby, H2, HSQLDB, MySQL, PostgreSQL
example code and SQL in the testng tests
new conceptual model: "simple" strategy, "oneshot" strategy, "table"
strategy
strategy consists of three modules for better reuse and extensibility
(also in other rivers): RiverSource, RiverMouth, RiverFlow
new bulk indexing, showcase of org.elasticsearch.action.bulk.BulkProcessor
many bug fixes
binaries moved to bintray

https://bintray.com/pkg/show/general/jprante/elasticsearch-plugins/elasticsearch-river-jdbc

Documentation is being reorganized on the github wiki, more to come!

Cheers,

Jörg

--

Gabriel_Mirea · January 25, 2013, 12:45pm

Hi Jörg,

1)I' m using a Oracle XE server on a win7 x64, where I've done:
1.1)unzip elasticsearch-0.20.2.zip
1.2)copy ojdbc14.jar to %ES_HOME%/lib/ojdbc14.jar
1.3)./bin/plugin -url http://bit.ly/U75w1N -install river-jdbc

Within oracle squirrel client:
2.1)Create Table ORDERS(
ORDER_ID Number(7) Primary Key,
description VARCHAR2(100));

Create Table OPTIONS(
OPTION_ID Number(7) Primary Key,
order_id Number(7),
option_name VARCHAR2(50),
CONSTRAINT fk_order
FOREIGN KEY (order_id)
REFERENCES ORDERS(order_id))

INSERT INTO ORDERS(ORDER_ID, description) values(1, '1st order');

INSERT INTO OPTIONS(OPTION_ID, ORDER_ID, option_name) values(1, 1, '1st
order option1');
INSERT INTO OPTIONS(OPTION_ID, ORDER_ID, option_name) values(2, 1, '1st
order option2');

2.2)sqplus->

SQL> select * from orders;

ORDER_ID DESCRIPTION

     1 1st order

SQL> select * from options;

OPTION_ID ORDER_ID OPTION_NAME

     1          1 1st order option1
     2          1 1st order option2

SQL> select 'order' as "_index", ord.order_id as "_id",ord.order_id as
"order.oId",option_name as "order.options" from orders ord inner join
options opt on ord.order_id = opt.order_id;

_index _id order.oId order.options

order 1 1 1st order option1
order 1 1 1st order option2

On a ubuntu VM box, hosted on the same win7 x64, I executed:
ubuntu@ubuntu-VirtualBox:~$ uname -a
Linux ubuntu-VirtualBox 3.2.0-24-generic-pae #37-Ubuntu SMP Wed Apr 25
10:47:59 UTC 2012 i686 i686 i386 GNU/Linux

ubuntu@ubuntu-VirtualBox:~$ curl -XPUT
'192.168.56.1:9200/_river/my_jdbc_river/_meta' -d '{
"type" : "jdbc",
"jdbc" : {
"driver" : "oracle.jdbc.OracleDriver",
"url" : "jdbc:oracle:thin:@localhost:1521/XE",
"user" : "",
"password" : "",
"poll" : "10s",
"sql" : "select \u0027order\u0027 as "_index", ord.order_id as
"_id",ord.order_id as "order.oId",option_name as "order.option" from
orders ord inner join

options opt on ord.order_id = opt.order_id"
}
}'

4)I'm searching for:
ubuntu@ubuntu-VirtualBox:~$ curl -XGET
'192.168.56.1:9200/order/jdbc/_search?pretty'
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 1.0,
"hits" : [ {
"_index" : "order",
"_type" : "jdbc",
"_id" : "1",
"_score" : 1.0, "_source" : {"order":{"oId":1,"option":"1st order
option2"},"_id":1,"_index":"order"}
} ]
}

Obviously is fetching only the 2nd record from the one-to-many relation,
the "1st order option1" is missing...where I've done wrong?

5)But I would like to have the following result:

"_score" : 1.0, "_source" : {"order":{"oId":1,"options":["1st order
option1","1st order option2"]},"_id":1,"_index":"order"}

that should be an equivalent result from your example:

index=relations id=Good
{"contact":{"employee":["Müller","Meier","Schulze"],"customer":"Good"}}

Can you please help me?with any hint?
Thanks
gabriel

--

dadoonet · January 25, 2013, 4:05pm

If _id is the ID of your doc in Elasticsearch, your second record is erasing the first one as they have the same _id.

Le 25 janv. 2013 à 13:45, Gabriel Mirea gmirea@sqli.com a écrit :

Hi Jörg,

1)I' m using a Oracle XE server on a win7 x64, where I've done:
1.1)unzip elasticsearch-0.20.2.zip
1.2)copy ojdbc14.jar to %ES_HOME%/lib/ojdbc14.jar
1.3)./bin/plugin -url http://bit.ly/U75w1N -install river-jdbc

Within oracle squirrel client:
2.1)Create Table ORDERS(
ORDER_ID Number(7) Primary Key,
description VARCHAR2(100));

Create Table OPTIONS(
OPTION_ID Number(7) Primary Key,
order_id Number(7),
option_name VARCHAR2(50),
CONSTRAINT fk_order
FOREIGN KEY (order_id)
REFERENCES ORDERS(order_id))

INSERT INTO ORDERS(ORDER_ID, description) values(1, '1st order');

INSERT INTO OPTIONS(OPTION_ID, ORDER_ID, option_name) values(1, 1, '1st order option1');
INSERT INTO OPTIONS(OPTION_ID, ORDER_ID, option_name) values(2, 1, '1st order option2');

2.2)sqplus->

SQL> select * from orders;

ORDER_ID DESCRIPTION
     1		1st order
SQL> select * from options;

OPTION_ID ORDER_ID OPTION_NAME
     1          1 1st order option1
     2          1 1st order option2
SQL> select 'order' as "_index", ord.order_id as "_id",ord.order_id as "order.oId",option_name as "order.options" from orders ord inner join
options opt on ord.order_id = opt.order_id;

_index _id order.oId order.options

order 1 1 1st order option1
order 1 1 1st order option2

On a ubuntu VM box, hosted on the same win7 x64, I executed:
ubuntu@ubuntu-VirtualBox:~$ uname -a
Linux ubuntu-VirtualBox 3.2.0-24-generic-pae #37-Ubuntu SMP Wed Apr 25 10:47:59 UTC 2012 i686 i686 i386 GNU/Linux

ubuntu@ubuntu-VirtualBox:~$ curl -XPUT '192.168.56.1:9200/_river/my_jdbc_river/_meta' -d '{
"type" : "jdbc",
"jdbc" : {
"driver" : "oracle.jdbc.OracleDriver",
"url" : "jdbc:oracle:thin:@localhost:1521/XE",
"user" : "",
"password" : "",
"poll" : "10s",
"sql" : "select \u0027order\u0027 as "_index", ord.order_id as "_id",ord.order_id as "order.oId",option_name as "order.option" from orders ord inner join

options opt on ord.order_id = opt.order_id"
}
}'

4)I'm searching for:
ubuntu@ubuntu-VirtualBox:~$ curl -XGET '192.168.56.1:9200/order/jdbc/_search?pretty'
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 1.0,
"hits" : [ {
"_index" : "order",
"_type" : "jdbc",
"_id" : "1",
"_score" : 1.0, "_source" : {"order":{"oId":1,"option":"1st order option2"},"_id":1,"_index":"order"}
} ]
}

Obviously is fetching only the 2nd record from the one-to-many relation, the "1st order option1" is missing...where I've done wrong?

5)But I would like to have the following result:

"_score" : 1.0, "_source" : {"order":{"oId":1,"options":["1st order option1","1st order option2"]},"_id":1,"_index":"order"}

that should be an equivalent result from your example:

index=relations id=Good {"contact":{"employee":["Müller","Meier","Schulze"],"customer":"Good"}}

Can you please help me?with any hint?
Thanks
gabriel

--

--

Umit_Seren · January 26, 2013, 10:50am

Great thanks for the plugin.
Just a small question. Is there a way how to craete parent-child entries
with the JDBC River plugin? At least I wasn't able to come up with a
solution with the previous version (haven't really looked into the new
version tough).

On Friday, January 18, 2013 12:30:16 AM UTC+1, Jörg Prante wrote:

Hey,

JDBC river plugin for Elasticsearch 2.0.0 is out.

GitHub - jprante/elasticsearch-jdbc: JDBC importer for Elasticsearch

Glimpsing the features:

for Elasticsearch 0.20+

more stable, tested against Derby, H2, HSQLDB, MySQL, PostgreSQL

example code and SQL in the testng tests

new conceptual model: "simple" strategy, "oneshot" strategy, "table"
strategy

strategy consists of three modules for better reuse and extensibility
(also in other rivers): RiverSource, RiverMouth, RiverFlow

new bulk indexing, showcase of
org.elasticsearch.action.bulk.BulkProcessor

many bug fixes

binaries moved to bintray

Service End for Bintray, JCenter, GoCenter, and ChartCenter | JFrog

Documentation is being reorganized on the github wiki, more to come!

Cheers,

Jörg

--

jprante · January 26, 2013, 12:02pm

Yes, you can specify a special column label beginning with a prefix
underscore '_parent' for creating children documents.

The _parent column value corresponds to the bulk document parameter _parent

I just pushed a fix for missing _timestamp handling in the JDBC river.

Best regards,

Jörg

Am 26.01.13 11:50, schrieb Ümit Seren:

Great thanks for the plugin.
Just a small question. Is there a way how to craete parent-child
entries with the JDBC River plugin? At least I wasn't able to come up
with a solution with the previous version (haven't really looked into
the new version tough).

--

Gabriel_Mirea · January 27, 2013, 9:53pm

hmm...conceptually, I understand that records erasing is happening due to
_id overwriting... can you please provide an hint/example how to avoid it?
many thanks,
gabriel

Le vendredi 25 janvier 2013 18:05:26 UTC+2, David Pilato a écrit :

If _id is the ID of your doc in Elasticsearch, your second record is
erasing the first one as they have the same _id.

Le 25 janv. 2013 à 13:45, Gabriel Mirea <gmi...@sqli.com <javascript:>> a
écrit :

Hi Jörg,

1)I' m using a Oracle XE server on a win7 x64, where I've done:
1.1)unzip elasticsearch-0.20.2.zip
1.2)copy ojdbc14.jar to %ES_HOME%/lib/ojdbc14.jar
1.3)./bin/plugin -url http://bit.ly/U75w1N -install river-jdbc

Within oracle squirrel client:
2.1)Create Table ORDERS(
ORDER_ID Number(7) Primary Key,
description VARCHAR2(100));

Create Table OPTIONS(
OPTION_ID Number(7) Primary Key,
order_id Number(7),
option_name VARCHAR2(50),
CONSTRAINT fk_order
FOREIGN KEY (order_id)
REFERENCES ORDERS(order_id))

INSERT INTO ORDERS(ORDER_ID, description) values(1, '1st order');

INSERT INTO OPTIONS(OPTION_ID, ORDER_ID, option_name) values(1, 1, '1st
order option1');
INSERT INTO OPTIONS(OPTION_ID, ORDER_ID, option_name) values(2, 1, '1st
order option2');

2.2)sqplus->

SQL> select * from orders;

ORDER_ID DESCRIPTION
     1                1st order 
SQL> select * from options;

OPTION_ID ORDER_ID OPTION_NAME
     1          1 1st order option1 
     2          1 1st order option2 
SQL> select 'order' as "_index", ord.order_id as "_id",ord.order_id as
"order.oId",option_name as "order.options" from orders ord inner join
options opt on ord.order_id = opt.order_id;

_index _id order.oId order.options
order 1 1 1st order option1
order 1 1 1st order option2

On a ubuntu VM box, hosted on the same win7 x64, I executed:
ubuntu@ubuntu-VirtualBox:~$ uname -a
Linux ubuntu-VirtualBox 3.2.0-24-generic-pae #37-Ubuntu SMP Wed Apr 25
10:47:59 UTC 2012 i686 i686 i386 GNU/Linux

ubuntu@ubuntu-VirtualBox:~$ curl -XPUT '
192.168.56.1:9200/_river/my_jdbc_river/_meta' -d '{
"type" : "jdbc",
"jdbc" : {
"driver" : "oracle.jdbc.OracleDriver",
"url" : "jdbc:oracle:thin:@localhost:1521/XE",
"user" : "",
"password" : "",
"poll" : "10s",
"sql" : "select \u0027order\u0027 as "_index", ord.order_id as
"_id",ord.order_id as "order.oId",option_name as "order.option" from
orders ord inner join

options opt on ord.order_id = opt.order_id"
}
}'

4)I'm searching for:
ubuntu@ubuntu-VirtualBox:~$ curl -XGET '
192.168.56.1:9200/order/jdbc/_search?pretty'
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 1,
"max_score" : 1.0,
"hits" : [ {
"_index" : "order",
"_type" : "jdbc",
"_id" : "1",
"_score" : 1.0, "_source" : {"order":{"oId":1,"option":"1st order
option2"},"_id":1,"_index":"order"}
} ]
}

Obviously is fetching only the 2nd record from the one-to-many relation,
the "1st order option1" is missing...where I've done wrong?

5)But I would like to have the following result:

"_score" : 1.0, "_source" : {"order":{"oId":1,"options":["1st order
option1","1st order option2"]},"_id":1,"_index":"order"}

that should be an equivalent result from your example:

index=relations id=Good
{"contact":{"employee":["Müller","Meier","Schulze"],"customer":"Good"}}

Can you please help me?with any hint?
Thanks
gabriel

--

--

jprante · January 28, 2013, 8:41am

Hi Gabriel,

Thanks for your example. It should work. I have to check what is going
wrong and opened an issue for it.

github.com/jprante/elasticsearch-jdbc

oracle object merge

opened 08:40AM - 28 Jan 13 UTC

closed 12:14PM - 19 Oct 14 UTC

jprante

bug

1)I' m using a Oracle XE server on a win7 x64, where I've done: 1.1)unzip elasti…csearch-0.20.2.zip 1.2)copy ojdbc14.jar to %ES_HOME%/lib/ojdbc14.jar 1.3)./bin/plugin -url http://bit.ly/U75w1N -install river-jdbc 2) Within oracle squirrel client: 2.1)Create Table ORDERS( ORDER_ID Number(7) Primary Key, description VARCHAR2(100)); Create Table OPTIONS( OPTION_ID Number(7) Primary Key, order_id Number(7), option_name VARCHAR2(50), CONSTRAINT fk_order FOREIGN KEY (order_id) REFERENCES ORDERS(order_id)) INSERT INTO ORDERS(ORDER_ID, description) values(1, '1st order'); INSERT INTO OPTIONS(OPTION_ID, ORDER_ID, option_name) values(1, 1, '1st order option1'); INSERT INTO OPTIONS(OPTION_ID, ORDER_ID, option_name) values(2, 1, '1st order option2'); 2.2)sqplus-> SQL> select \* from orders; ORDER_ID DESCRIPTION --- ``` 1 1st order ``` SQL> select \* from options; OPTION_ID ORDER_ID OPTION_NAME --- ``` 1 1 1st order option1 2 1 1st order option2 ``` SQL> select 'order' as "_index", ord.order_id as "_id",ord.order_id as "order.oId",option_name as "order.options" from orders ord inner join options opt on ord.order_id = opt.order_id; _index _id order.oId order.options --- order 1 1 1st order option1 order 1 1 1st order option2 3) On a ubuntu VM box, hosted on the same win7 x64, I executed: ubuntu@ubuntu-VirtualBox:~$ uname -a Linux ubuntu-VirtualBox 3.2.0-24-generic-pae #37-Ubuntu SMP Wed Apr 25 10:47:59 UTC 2012 i686 i686 i386 GNU/Linux ubuntu@ubuntu-VirtualBox:~$ curl -XPUT '192.168.56.1:9200/_river/my_jdbc_river/_meta' -d '{ "type" : "jdbc", "jdbc" : { "driver" : "oracle.jdbc.OracleDriver", "url" : "jdbc:oracle:thin:@localhost:1521/XE", "user" : "", "password" : "", "poll" : "10s", "sql" : "select \u0027order\u0027 as \"_index\", ord.order_id as \"_id\",ord.order_id as \"order.oId\",option_name as \"order.option\" from orders ord inner join options opt on ord.order_id = opt.order_id" } }' 4)I'm searching for: ubuntu@ubuntu-VirtualBox:~$ curl -XGET '192.168.56.1:9200/order/jdbc/_search?pretty' { "took" : 2, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 }, "hits" : { "total" : 1, "max_score" : 1.0, "hits" : [ { "_index" : "order", "_type" : "jdbc", "_id" : "1", "_score" : 1.0, "_source" : {"order":{"oId":1,"option":"1st order option2"},"_id":1,"_index":"order"} } ] } Obviously is fetching only the 2nd record from the one-to-many relation, the "1st order option1" is missing...where I've done wrong? 5)But I would like to have the following result: "_score" : 1.0, "_source" : {"order":{"oId":1,"options":["1st order option1","1st order option2"]},"_id":1,"_index":"order"} that should be an equivalent result from your example: index=relations id=Good {"contact":{"employee":["Müller","Meier","Schulze"],"customer":"Good"}} Can you please help me?with any hint? Thanks gabriel

Best reagrds,

JÃ¶rg

Am 27.01.13 22:53, schrieb Gabriel Mirea:

hmm...conceptually, I understand that records erasing is happening
due to _id overwriting... can you please provide an hint/example how
to avoid it?
many thanks,
gabriel

If _id is the ID of your doc in Elasticsearch, your second record
is erasing the first one as they have the same _id.


Le 25 janv. 2013 Ã  13:45, Gabriel Mirea <gmi...@sqli.com
<javascript:>> a Ã©crit :

> Hi JÃ¶rg,
>
> 1)I' m using a Oracle XE server on a win7 x64, where I've done:
> 1.1)unzip elasticsearch-0.20.2.zip
> 1.2)copy ojdbc14.jar to %ES_HOME%/lib/ojdbc14.jar
> 1.3)./bin/plugin -url http://bit.ly/U75w1N -install river-jdbc
>
> 2) Within oracle squirrel client:
> 2.1)Create Table ORDERS(
> ORDER_ID Number(7) Primary Key,
> description VARCHAR2(100));
>
> Create Table OPTIONS(
> OPTION_ID Number(7) Primary Key,
> order_id Number(7),
> option_name VARCHAR2(50),
> CONSTRAINT fk_order
>     FOREIGN KEY (order_id)
>     REFERENCES ORDERS(order_id))
>
> INSERT INTO ORDERS(ORDER_ID, description) values(1, '1st order');
>
> INSERT INTO OPTIONS(OPTION_ID, ORDER_ID, option_name) values(1,
1, '1st order option1');
> INSERT INTO OPTIONS(OPTION_ID, ORDER_ID, option_name) values(2,
1, '1st order option2');
>
> 2.2)sqplus->
>
>
> SQL> select * from orders;
>
>   ORDER_ID   DESCRIPTION
> ----------  
--------------------------------------------------------------------------------

>          1                1st order
>
> SQL> select * from options;
>
>  OPTION_ID   ORDER_ID OPTION_NAME
> ---------- ----------
--------------------------------------------------
>          1          1 1st order option1
>          2          1 1st order option2
>
> SQL> select 'order' as "_index", ord.order_id as
"_id",ord.order_id as "order.oId",option_name as "order.options"
from orders ord inner join
> options opt on ord.order_id = opt.order_id;
>
> _index        _id  order.oId order.options
> ------ ---------- ----------
--------------------------------------------------
> order           1          1 1st order option1
> order           1          1 1st order option2
>
>
> 3) On a ubuntu VM box, hosted on the same win7 x64, I executed:
> ubuntu@ubuntu-VirtualBox:~$ uname -a
> Linux ubuntu-VirtualBox 3.2.0-24-generic-pae #37-Ubuntu SMP Wed
Apr 25 10:47:59 UTC 2012 i686 i686 i386 GNU/Linux
>
> ubuntu@ubuntu-VirtualBox:~$ curl -XPUT
'192.168.56.1:9200/_river/my_jdbc_river/_meta
<http://192.168.56.1:9200/_river/my_jdbc_river/_meta>' -d '{
>     "type" : "jdbc",
>     "jdbc" : {
>         "driver" : "oracle.jdbc.OracleDriver",
>         "url" : "jdbc:oracle:thin:@localhost:1521/XE",
>         "user" : "",
>         "password" : "",
>         "poll" : "10s",
>         "sql" : "select \u0027order\u0027 as \"_index\",
ord.order_id as \"_id\",ord.order_id as \"order.oId\",option_name
as \"order.option\" from orders ord inner join
>
> options opt on ord.order_id = opt.order_id"
>     }
> }'
>
> 4)I'm searching for:
> ubuntu@ubuntu-VirtualBox:~$ curl -XGET
'192.168.56.1:9200/order/jdbc/_search?pretty
<http://192.168.56.1:9200/order/jdbc/_search?pretty>'
> {
>   "took" : 2,
>   "timed_out" : false,
>   "_shards" : {
>     "total" : 5,
>     "successful" : 5,
>     "failed" : 0
>   },
>   "hits" : {
>     "total" : 1,
>     "max_score" : 1.0,
>     "hits" : [ {
>       "_index" : "order",
>       "_type" : "jdbc",
>       "_id" : "1",
>       "_score" : 1.0, "_source" :
{"order":{"oId":1,"option":"1st order
option2"},"_id":1,"_index":"order"}
>     } ]
>   }
>
> Obviously is fetching only the 2nd record from the one-to-many
relation, the "1st order option1" is missing...where I've done wrong?
>
> 5)But I would like to have the following result:
>
> "_score" : 1.0, "_source" : {"order":{"oId":1,"options":["1st
order option1","1st order option2"]},"_id":1,"_index":"order"}
>
> that should be an equivalent result from your example:
>
>
> index=relations id=Good
{"contact":{"employee":["MÃ¼ller","Meier","Schulze"],"customer":"Good"}}

>
> Can you please help me?with any hint?
> Thanks
> gabriel
>
> --
>
>

--

carlozlopes · March 6, 2013, 3:38pm

Hi everybody,

I'm a newbie in elasticsearch and I'm trying to make a mapping between some
mysql tables and elasticsearch. For this purpuse I use river-jdbc but I am
having some issues to represent (foreign keys).
After some search over the internet I found that there's an option for
elasticsearch to represent a parent-child link which could be the best
aproach to define a "foreign key" in elasticsearch.

The problem that I am facing is that even if I have read some explanations
of how to use the _parent prefix I don't know how to implement it in a
river-jdbc query in order to preserve mysql's structure represented in a
parent-child structure in elasticsearch.

The below JSON representation is what I intend to do using a river-jdbc
query.

document = {
"docid": 123456789,
"text": "This is an ingredient description",
"category": [{
"catid": '124122151',
"name": "avocato",
"product": [
{"prodid": "12452361"},
{"name": "Bio avacato"}
]
},
{
"catid": '124122153',
"name": "carrot",
"product": [
{"prodid": "12452362"},
{"name": "Bio carrot"}
]
}],
}

On Saturday, January 26, 2013 1:02:36 PM UTC+1, Jörg Prante wrote:

Yes, you can specify a special column label beginning with a prefix
underscore '_parent' for creating children documents.

Labeled columns · jprante/elasticsearch-jdbc Wiki · GitHub

The _parent column value corresponds to the bulk document parameter
_parent

Elasticsearch Platform — Find real-time answers at scale | Elastic

I just pushed a fix for missing _timestamp handling in the JDBC river.

Best regards,

Jörg

Am 26.01.13 11:50, schrieb Ümit Seren:

Great thanks for the plugin.
Just a small question. Is there a way how to craete parent-child
entries with the JDBC River plugin? At least I wasn't able to come up
with a solution with the previous version (haven't really looked into
the new version tough).

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

jprante · March 6, 2013, 11:34pm

The idea is to run the river twice, once for the parents, and once for
the children docs.

It does not matter if parents exist, you can add children nevertheless.
So you can run the children river first and parent second or vice versa.

I think you are aware of it, parent/child in ES is not really a "foreign
key" surrogate. It helps to enhance performance because related
documents are stored nearby and there is a strict 1:n relationship. But,
unlike foreign keys in the relational DB context, there is no contraint
management and no rejection of deletes if your data in the index gets
inconstent.

Does this help?

Jörg

Am 06.03.13 16:38, schrieb Carlos Alberto López Servín:

Hi everybody,

I'm a newbie in elasticsearch and I'm trying to make a mapping between
some mysql tables and elasticsearch. For this purpuse I use
river-jdbc but I am having some issues to represent (foreign keys).
After some search over the internet I found that there's an option for
elasticsearch to represent a parent-child link which could be the best
aproach to define a "foreign key" in elasticsearch.

The problem that I am facing is that even if I have read some
explanations of how to use the _parent prefix I don't know how to
implement it in a river-jdbc query in order to preserve mysql's
structure represented in a parent-child structure in elasticsearch.

The below JSON representation is what I intend to do using a
river-jdbc query.

document = {
"docid": 123456789,
"text": "This is an ingredient description",
"category": [{
"catid": '124122151',
"name": "avocato",
"product": [
{"prodid": "12452361"},
{"name": "Bio avacato"}
]
},
{
"catid": '124122153',
"name": "carrot",
"product": [
{"prodid": "12452362"},
{"name": "Bio carrot"}
]
}],
}

On Saturday, January 26, 2013 1:02:36 PM UTC+1, Jörg Prante wrote:
Yes, you can specify a special column label beginning with a prefix
underscore '_parent' for creating children documents.

https://github.com/jprante/elasticsearch-river-jdbc/wiki/Labeled-columns
<https://github.com/jprante/elasticsearch-river-jdbc/wiki/Labeled-columns>


The _parent column value corresponds to the bulk document
parameter _parent

http://www.elasticsearch.org/guide/reference/api/bulk.html
<http://www.elasticsearch.org/guide/reference/api/bulk.html>

I just pushed a fix for missing _timestamp handling in the JDBC
river.

Best regards,

Jörg

Am 26.01.13 11:50, schrieb Ümit Seren:
> Great thanks for the plugin.
> Just a small question. Is there a way how to craete parent-child
> entries with the JDBC River plugin? At least I wasn't able to
come up
> with a solution with the previous version (haven't really looked
into
> the new version tough).
>
>
--
You received this message because you are subscribed to the Google
Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Topic		Replies	Views
[ANNOUNCEMENT] JDBC River 2.2.0 for Elasticsearch 0.90+ Elasticsearch	2	408	July 6, 2017
[Ann] JDBC River Plugin for ElasticSearch Elasticsearch	20	2695	July 6, 2017
[ANN] JDBC river 2.3.0 Elasticsearch	1	370	July 6, 2017
[ANN] JDBC river 1.0.0.1 released Elasticsearch	6	384	July 6, 2017
[ANN] JDBC river/feeder plugin 1.2.2.0 Elasticsearch	1	375	July 6, 2017

[ANN] JDBC river for Elasticsearch 2.0.0

ORDER_ID DESCRIPTION

ORDER_ID DESCRIPTION

Related topics