Question about multiple inputs

xibal · October 23, 2019, 6:45pm

Hi, I'm migrating orders and their itemsin a joined SQL and using aggregate filter to list items as a nested field under each order. Each document's id is the respective orders table id.

Now, in another table I have order_shipments and another one for order_history.
Both tables have an order_id column that I want to find in my index and add shipments and history nested fields.

I know you can have multiple jdbc {} blocks, but would I be able to make such an update to the existing documents in the index? If now, how can I achieve this?

AquaX · October 23, 2019, 8:20pm

When you define the elasticseach output you can also set the "_id" field manually. If you insert a document with an "_id" that already exists then it will be updated.

https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html#plugins-outputs-elasticsearch-document_id

Also, please try not to use nested fields. They just created headaches.

xibal · October 23, 2019, 8:25pm

Thanks for the hint! In this case though, I've got a JSON column called order_metadata for example, which is a growing/shrinking list of key values. How would you implement something like that without the nested type?

Another example is, order_notes or discount_list their values from the main storage look like:
[{"name":"Discount","code":"","amount":"22.00"}]

AquaX · October 23, 2019, 8:44pm

I'm not saying you can't use nested field, it just makes dealing with Elasticsearch and Kibana more difficult as you have to create nested data types and do nested aggregations and if you don't you can get strange results.

What happens in the background is that lucene flattens the JSON object and you lose the relationship between the fields in the array since each of they keys and values are put into their own arrays. So when you go to search you may get more results of documents. For example:

You have two documents:

{
"transaction":"ABC123",
"discount" : [
    {"name":"DiscountA","code":"","amount":"4.00"},
    {"name":"Coupon","code":"","amount":"22.00"}
    ]
}

{
"transaction":"DEF456",
"discount" : [
    {"name":"DiscountB","code":"","amount":"22.00"}
    ]
}

If you write a query:

{
"query": {
          "bool": {
                  "must": [
                           { "match": { "discount.name": "Discount" } },
                           { "match": { "discount.amount": "22.00" } }
                    ]
              }
      }
}

You would expect to only get document DEF456 but instead you will get both documents. Try it.

xibal · October 23, 2019, 9:02pm

I see your point, indeed. The kind of queries we would make use of this field would always include the top level document's id btw. (order id) (in your example it's equal to having the "transaction_id" in the query and we would be more interested in counting the total of the "amount" field you see there for example, or distinct "codes" found for a document.

I guess what we could do as another option would be saving a custom top level field for the discount totals per document? Not sure about the "code" => "amount" relationship though...

Another very common search we do is looking for a specific key's value in the order_metadata, which looks like this:
{"custom":"foo","another":"835-407-ABC","shipped":"2019-09-25 17:12:26"}

The keys for this field differ in every document. We'd even need to raise the default 10K limit I think because we allow any amount of key-value pairs for this.

We would be searching for example; whether "custom" was set as "foo" or whether the "shipped" exists for a given document id.

system · November 20, 2019, 9:02pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Insert multiple jdbc rows into the same ES document under nested field Logstash	1	275	June 24, 2020
Ingest SQL relationships as nested documents Logstash	4	626	June 15, 2018
Importing into nested document using logstash Logstash	1	1440	September 22, 2017
Append to nested type object from multiple jdbc input stream using logstash aggregation Logstash	4	902	September 19, 2019
Im a bit confused on JDBC Indexes/document_types Elasticsearch	8	1778	April 11, 2017

Question about multiple inputs

Related topics