Hi, I'm migrating orders and their itemsin a joined SQL and using aggregate filter to list items as a nested field under each order. Each document's id is the respective orders table id.
Now, in another table I have order_shipments and another one for order_history.
Both tables have an order_id column that I want to find in my index and add shipments and history nested fields.
I know you can have multiple jdbc {} blocks, but would I be able to make such an update to the existing documents in the index? If now, how can I achieve this?
When you define the elasticseach output you can also set the "_id" field manually. If you insert a document with an "_id" that already exists then it will be updated.
Thanks for the hint! In this case though, I've got a JSON column called order_metadata for example, which is a growing/shrinking list of key values. How would you implement something like that without the nested type?
Another example is, order_notes or discount_list their values from the main storage look like: [{"name":"Discount","code":"","amount":"22.00"}]
I'm not saying you can't use nested field, it just makes dealing with Elasticsearch and Kibana more difficult as you have to create nested data types and do nested aggregations and if you don't you can get strange results.
What happens in the background is that lucene flattens the JSON object and you lose the relationship between the fields in the array since each of they keys and values are put into their own arrays. So when you go to search you may get more results of documents. For example:
I see your point, indeed. The kind of queries we would make use of this field would always include the top level document's id btw. (order id) (in your example it's equal to having the "transaction_id" in the query and we would be more interested in counting the total of the "amount" field you see there for example, or distinct "codes" found for a document.
I guess what we could do as another option would be saving a custom top level field for the discount totals per document? Not sure about the "code" => "amount" relationship though...
Another very common search we do is looking for a specific key's value in the order_metadata, which looks like this: {"custom":"foo","another":"835-407-ABC","shipped":"2019-09-25 17:12:26"}
The keys for this field differ in every document. We'd even need to raise the default 10K limit I think because we allow any amount of key-value pairs for this.
We would be searching for example; whether "custom" was set as "foo" or whether the "shipped" exists for a given document id.
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.