JDBC river: update documents with new fields

Hi,

The source for our documents is a database with several database views. Each
view contains specific attributes for a product that only apply to certain
groups of products (a book has an attribute 'binding' and a DVD has the
attribute 'playtime'). There is one database view with the 'core' attributes
of a product. All views contain the productId to be able to join on.

I'm using the jdbc-river to bulk index all database views. The core view
is no problem. However, I was hoping that when I index another database view,
the (relevant) documents in the index would get updated with the new fields.
For example, there is an attribute 'author' which resides in the
book-attributes view but not in the core view. When indexing, it should
update the core (book) documents and 'append' the 'author' field. However,
it does not seem to work. I'm using select productId as "_id" for
indexing.

Querying a document by id returns either only the core attributes or only
the author attribute, but never both. The mapping (viewed via the head
plugin) looks correct: the author field is returned as part of the mapping.

Any suggestions?
Maarten

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Can you give a short example of the docs in your index, the mapping, and
how you query the index, so I can better understand what you want to achieve?

Jörg

(Replying to Maarten Roosendaal's message of Fri, Jul 19, 2013, 7:26 PM.)

Hi,

Here's an example. With the 'core' database view you can index products like:

{
  "productName": "aaaa",
  "productID": "1234",
  "warehouse": "DD",
  "vendor": "X",
  "categories": "hardware,tools"
}

and from the second database view (description):

{
  "productID": "1234",
  "productDescription": "something"
}

So the second view should supplement the product with id '1234' with a new
field 'productDescription'. Instead, all products whose ids also appear in
the description view get overwritten and end up with only the
productDescription field.
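
A minimal sketch of why this happens, using a plain Python dict as a stand-in for the index (this is not the Elasticsearch client, and `index_doc` and `merge_doc` are hypothetical helper names). An index operation with an existing `_id` replaces the stored source as a whole, while the mapping is kept per index and type, which would explain why the new field still shows up in the mapping. What is wanted here is the merge behaviour of the second helper.

```python
# In-memory stand-in for an index: maps _id -> document source.
# This only models the semantics; it does not talk to Elasticsearch.

def index_doc(index, doc_id, source):
    """Model of an index operation: the stored source is replaced wholesale."""
    index[doc_id] = dict(source)


def merge_doc(index, doc_id, partial):
    """Model of a partial update: new fields are merged into the stored source."""
    merged = dict(index.get(doc_id, {}))
    merged.update(partial)
    index[doc_id] = merged


core_row = {
    "productName": "aaaa",
    "productID": "1234",
    "warehouse": "DD",
    "vendor": "X",
    "categories": "hardware,tools",
}
description_row = {"productID": "1234", "productDescription": "something"}

# Indexing both rows under the same _id: the second row wins.
replaced = {}
index_doc(replaced, "1234", core_row)
index_doc(replaced, "1234", description_row)

# Indexing the core row, then merging the description row: fields accumulate.
merged = {}
index_doc(merged, "1234", core_row)
merge_doc(merged, "1234", description_row)

print(sorted(replaced["1234"]))
print(sorted(merged["1234"]))
```

The first list contains only the two fields of the description row; the second contains the core fields plus `productDescription`.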

I hope there is a way to do this.
Thanks,
Maarten

(Replying to Jörg Prante's message of Saturday, July 20, 2013, 18:31:34 UTC+2.)

Have you thought about using index types?

Index "products", Type "core", Id "1234" and Index "products", Type
"description", Id "descriptionId"

If you query the index "products" and filter on a field holding the product
ID, you get back all the descriptions belonging to that product ID.
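
A rough sketch of the search body such a query could use, built as a Python dict. It assumes the `filtered` query syntax of the Elasticsearch releases current at the time of this thread (later versions express the same thing differently), a field named `productID` present in every type and indexed so that a `term` filter matches the stored value, and a hypothetical helper name `product_filter_query`. The body would be sent to the search endpoint of the "products" index.

```python
import json


def product_filter_query(product_id):
    """Build a search body matching every document whose productID equals product_id."""
    return {
        "query": {
            "filtered": {
                # match_all + term filter: no scoring, only an exact match on the id.
                "query": {"match_all": {}},
                "filter": {"term": {"productID": product_id}},
            }
        }
    }


body = product_filter_query("1234")
print(json.dumps(body, indent=2))
```

Because the search is not restricted to one type, hits of type "core" and "description" come back together and have to be combined by the client.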

Jörg


Hi,

Yes, that has crossed my mind and is still an option. But a product gets
its information out of 7 or 8 views and I don't want to complicate it too
much. I would rather have one products index. An additional complexity is
that we have several types of product and each type has its own set of
attributes. For example, 'NrOfPages' is relevant for a book but not for a
laptop.

I was hoping to do this with the jdbc-river. Otherwise I either have to
'pre'-join the information, do a complex join query with the jdbc-river, or
do what you suggest.
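
For illustration, the join-query option could look roughly like the sketch below. It uses Python's built-in `sqlite3` with an in-memory database in place of the real one; the names `core_vw` and `book_vw`, their columns and the sample rows are made up, and plain tables stand in for the views. The `LEFT JOIN` keeps products that have no book attributes, and NULL columns are dropped so that a laptop does not carry empty book fields.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE core_vw (productId TEXT PRIMARY KEY, productName TEXT, vendor TEXT);
    CREATE TABLE book_vw (productId TEXT PRIMARY KEY, author TEXT, nrOfPages INTEGER);
    INSERT INTO core_vw VALUES ('1234', 'aaaa', 'X'), ('5678', 'bbbb', 'Y');
    INSERT INTO book_vw VALUES ('1234', 'Some Author', 300);
""")

# One row per product; every column is aliased so the result keys are explicit.
join_sql = """
    SELECT c.productId   AS "_id",
           c.productName AS "productName",
           c.vendor      AS "vendor",
           b.author      AS "author",
           b.nrOfPages   AS "nrOfPages"
    FROM core_vw c
    LEFT JOIN book_vw b ON b.productId = c.productId
    ORDER BY c.productId
"""

cursor = conn.execute(join_sql)
columns = [col[0] for col in cursor.description]
docs = []
for row in cursor:
    # Skip NULLs: attributes that do not apply to this product are left out.
    docs.append({key: value for key, value in zip(columns, row) if value is not None})
conn.close()

for doc in docs:
    print(doc)
```

With 7 or 8 views the statement grows by one `LEFT JOIN` per view, which is the complexity mentioned above.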

Thanks,
Maarten

(Replying to Jörg Prante's message of Friday, July 26, 2013, 11:29:44 UTC+2.)

What does work (without the jdbc-river) is the update API:
http://www.elasticsearch.org/guide/reference/api/update/ --> see "add a new
field". So it would help if I could somehow combine that with a river query
like:

select productId as "_id", DESCRIPTION as "ctx._source.description" from
DESCRIPTIONS_VW
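
Outside the river, the same effect could be sketched by turning each row of the description view into an `update` action for the bulk API. The snippet below only builds the request body; it assumes an Elasticsearch version whose bulk API accepts `update` actions with a partial `doc`, an index called "products", and a hypothetical type name "product" and helper name `bulk_update_body`. Fetching the rows from DESCRIPTIONS_VW and posting the body to the bulk endpoint are left out.

```python
import json


def bulk_update_body(rows, index, doc_type, field):
    """Build a newline-delimited bulk body with one partial update per row.

    rows is an iterable of (product_id, value) pairs, e.g. from the description view.
    """
    lines = []
    for product_id, value in rows:
        # Action line: which document to update.
        action = {"update": {"_index": index, "_type": doc_type, "_id": product_id}}
        # Payload line: fields to merge into the existing source.
        partial = {"doc": {field: value}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(partial))
    # The bulk format expects a trailing newline after the last line.
    return "\n".join(lines) + "\n"


rows = [("1234", "something")]
body = bulk_update_body(rows, "products", "product", "description")
print(body, end="")
```

Each update merges the given field into the stored document instead of replacing it, so the core document has to be indexed before the update is applied.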

(Following up on Maarten Roosendaal's message of Monday, July 29, 2013, 12:43:14 UTC+2.)

The JDBC river uses bulk indexing for performance. In bulk mode, it is
difficult to support updates.

I know it does not help you yet, but the challenge of building complex
documents out of simple documents is not new. I am working on a dereferencer
plugin that is independent of JDBC and uses Linked Data techniques instead.
With that plugin you will be able to embed IRIs (Internationalized Resource
Identifiers) as JSON values (as in JSON-LD), which are dereferenced for
content at the time they are indexed. The content can come from other docs
in the index or from RDF resources at remote sources. Think of a graph of
resources getting combined into one indexable document. The price is lower
indexing performance.
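
To make the idea concrete, here is a toy sketch of dereferencing at index time. It is not the plugin and says nothing about how the plugin works internally: a plain dict stands in for the place where referenced resources live, references are written as `{"@id": ...}` objects in the style of JSON-LD, and the function name `dereference` and the `urn:example:` identifiers are invented for the example.

```python
def dereference(doc, store):
    """Return a copy of doc with every {"@id": iri} value replaced by the
    resource stored under that IRI (one level deep, unknown IRIs left as-is)."""
    resolved = {}
    for key, value in doc.items():
        is_reference = isinstance(value, dict) and set(value) == {"@id"}
        if is_reference and value["@id"] in store:
            # Embed a copy of the referenced resource in place of the reference.
            resolved[key] = dict(store[value["@id"]])
        else:
            resolved[key] = value
    return resolved


# Hypothetical resource store keyed by IRI.
store = {
    "urn:example:description:1234": {"productDescription": "something"},
}
product = {
    "productID": "1234",
    "productName": "aaaa",
    "description": {"@id": "urn:example:description:1234"},
}

indexable = dereference(product, store)
print(indexable)
```

The extra lookup per reference is where the indexing cost comes from.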

Jörg
