For a project I'm currently working on I have two rivers, a JDBC river and a
CouchDB river, each feeding the same index but a different type. The
documents coming from these rivers are related to each other: the JDBC
river gives basic product information and the CouchDB river gives extended
product information.
Now I want to search for documents based on two fields, one on each type.
As Elasticsearch doesn't support joining documents (at least not as far
as I can see), I was searching for a way of combining the two rivers into one
document, but it looks like the documents get overwritten instead of
extended with the new properties.
Is this kind of behavior possible or should I approach this in a different
way?
You should definitely manage that in another tool or layer, for example in a service layer or with an ETL tool.
Rivers are not designed to aggregate documents from different sources. They are just a helper for getting started quickly with Elasticsearch, but at the end of the day most users prefer to manage that ingestion process outside the Elasticsearch nodes themselves.
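As a rough illustration of what "a service layer" could look like here, the sketch below merges the two sources by a shared key before anything is sent to Elasticsearch, so each product ends up as one document. The field name `product_id`, the index name `products`, and the sample records are assumptions made up for the example; the output is a request body in the bulk API's newline-delimited JSON format, which you would then POST to `_bulk` with whatever client you use.

```python
import json


def merge_products(basic_rows, extended_docs, key="product_id"):
    """Combine basic and extended records that share the same key."""
    merged = {}
    for row in basic_rows:
        # Start each product from its basic (JDBC-side) record.
        merged[row[key]] = dict(row)
    for doc in extended_docs:
        # Add the extended (CouchDB-side) fields instead of overwriting.
        merged.setdefault(doc[key], {}).update(doc)
    return merged


def to_bulk_lines(merged, index="products"):
    """Render merged documents as a bulk API body (one action line, one source line)."""
    lines = []
    for doc_id, source in merged.items():
        lines.append(json.dumps({"index": {"_index": index, "_id": str(doc_id)}}))
        lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"


# Hypothetical sample data standing in for the two sources.
basic_rows = [{"product_id": 1, "name": "Kettle", "price": 19.99}]
extended_docs = [{"product_id": 1, "description": "Stainless steel, 1.7 l"}]

merged = merge_products(basic_rows, extended_docs)
bulk_body = to_bulk_lines(merged)
```

With both sets of fields in a single document, a query on one basic field and one extended field no longer needs a join.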
Please note that the JDBC river is not a product and cannot provide product
information.
It is just a helper to quickly load some data into ES, for example to
convince the boss with a POC within a few hours or days. There is no
intention to replace ETL products.