Need help evaluating Logstash for syncing Oracle data into Elasticsearch (combining records from 25+ tables into a single nested document)

We are planning to use Logstash to sync data from Oracle to Elasticsearch.
There are 25+ tables in Oracle. The plan is to have a single index in Elasticsearch with nested/flattened documents that will contain data from all associated tables.
We are evaluating whether Logstash can be used to fetch records from Oracle (by joining all the tables), transform them into a nested structure, and then load them into Elasticsearch. The Oracle tables can have one-to-many relationships, so the corresponding data needs to be structured as lists of objects in the JSON document, and this structure can become very complex.
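For reference, the usual Logstash pattern for this is a `jdbc` input that runs the join, plus an `aggregate` filter that collapses the one-to-many rows into a single event with a list of child objects. A minimal sketch, assuming hypothetical `orders`/`order_items` tables and an `updated_at` tracking column (all names here are illustrative, not from the original post):

```
input {
  jdbc {
    jdbc_driver_library => "/path/to/ojdbc8.jar"
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
    jdbc_connection_string => "jdbc:oracle:thin:@//dbhost:1521/ORCL"
    jdbc_user => "user"
    jdbc_password => "password"
    schedule => "*/5 * * * *"
    # Join parent and child tables; ORDER BY the parent key so the
    # aggregate filter sees all rows for one parent consecutively.
    statement => "SELECT o.order_id, o.customer, o.updated_at, i.item_id, i.qty
                  FROM orders o JOIN order_items i ON i.order_id = o.order_id
                  WHERE o.updated_at > :sql_last_value
                  ORDER BY o.order_id"
    use_column_value => true
    tracking_column => "updated_at"
    tracking_column_type => "timestamp"
  }
}
filter {
  # Collapse the one-to-many join rows into one event per order,
  # with the child rows as a list of objects under "items".
  aggregate {
    task_id => "%{order_id}"
    code => "
      map['order_id'] ||= event.get('order_id')
      map['customer'] ||= event.get('customer')
      map['items'] ||= []
      map['items'] << { 'item_id' => event.get('item_id'), 'qty' => event.get('qty') }
      event.cancel
    "
    push_previous_map_as_event => true
    timeout => 30
  }
}
```

Note that the `aggregate` filter requires `pipeline.workers: 1` to guarantee rows for the same parent are processed in order, which limits per-pipeline throughput and is directly relevant to the scaling question below.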

  1. We would like to know whether Logstash is suitable for our requirement and performant enough to load a large volume of data.
  2. How can we scale Logstash to improve transformation and indexing performance?
  3. If a tracking column is not present in the associated child tables, how will data be synced? Will we need to write a trigger to update a tracking column (maybe an updated-date column) in the parent table, so that Logstash will pick up the change and start the sync process?

You could make use of Elastic's update, upsert, or scripted upsert functionality to combine data from different sources into a single destination.
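As a sketch of what that looks like on the Logstash side, the `elasticsearch` output supports `action => "update"` together with `scripted_upsert`, so several pipelines (one per source table) can merge into the same document ID. Index, ID, and field names below are illustrative assumptions:

```
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "combined"
    document_id => "%{order_id}"
    action => "update"
    scripted_upsert => true
    script_type => "inline"
    script_lang => "painless"
    # The Logstash event is exposed to the script as params.event
    # (controlled by script_var_name, which defaults to "event").
    script => "ctx._source.items = params.event.items"
  }
}
```

Each source pipeline writes only its own fields into the shared document, so the order in which pipelines run does not matter.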

Examples of scripted upserts executed by Logstash can be found in the following blogs:

In conjunction with scripted upserts it may be necessary to iterate over all fields in the parameters that are passed in. You may find the following article useful for guidance on how to do this in Painless:
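A minimal sketch of that iteration, assuming the Logstash event arrives as `params.event` (the output plugin's default `script_var_name`):

```
// Copy every field from the incoming event into the stored document,
// so fields from any source pipeline are merged into the existing
// document instead of replacing it wholesale.
for (entry in params.event.entrySet()) {
  ctx._source[entry.getKey()] = entry.getValue();
}
```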

You may also find the following article to be relevant:

You may also wish to review the following comment about search-time operations:

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.