Tomo_M
(Tomohiro Mitani)
January 21, 2022, 10:41am
8
Hi,
a transform can have multiple sources, and they do not need a common schema as long as the field used in the group_by is compatible. So if you have, for example, a field "customer_id" in both indices "past_purchases" and "predicted_purchases", you can create a transform that joins over both indices:
PUT _transform/prediction_accuracy
{
  "source": {
    "index": ["past_purchases", "predicted_purchases"]
  },
  "pivot": {
    "group_by": {
      "id": {"terms": {
        "field": "customer_id"
…
Hi,
you should be able to do this using transform (see https://www.elastic.co/guide/en/elasticsearch/reference/7.4/put-transform.html ). The workflow would look like this:
1. You index the data into Elasticsearch as single docs; this is the "source" index.
2. You create a transform that pulls data from the "source" index and aggregates it according to your needs into a "dest" index.
In your case you would group by entity Id, firm id, etc. and define aggregations, e.g. lastUpdated would be a max aggregatio…
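The workflow above can be sketched as a request body built in Python. This is only an illustration under assumptions: the index names `entities` and `entities_latest` and the field names `entity_id`, `firm_id`, and `lastUpdated` are hypothetical stand-ins for the fields mentioned in the excerpt, not the original poster's mapping. The resulting dict would be sent as the body of a `PUT _transform/<id>` request; nothing here talks to a cluster.

```python
import json


def build_latest_entity_transform(source_index, dest_index, group_fields, max_field):
    """Build a pivot transform body.

    Each entry in group_fields becomes a terms group_by, and max_field
    gets a max aggregation (e.g. the most recent lastUpdated per group).
    """
    group_by = {field: {"terms": {"field": field}} for field in group_fields}
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        "pivot": {
            "group_by": group_by,
            "aggregations": {max_field: {"max": {"field": max_field}}},
        },
    }


# Hypothetical index and field names, chosen only for illustration.
body = build_latest_entity_transform(
    source_index="entities",
    dest_index="entities_latest",
    group_fields=["entity_id", "firm_id"],
    max_field="lastUpdated",
)
print(json.dumps(body, indent=2))
```

Building the body as a plain dict keeps the sketch independent of any particular client library; it can be serialized and sent with whichever HTTP or Elasticsearch client you already use.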
I'd like to add another option: transform.
A transform, in a nutshell, is a task that runs aggregation queries and persists the result in an index. In order to join 2 indices you need compatible mappings for grouping, e.g. a key that is named the same way and has the same type. Note that this can also be achieved with scripts or runtime fields; however, if you want to run it continuously at scale, it is advised to use proper ordinary mappings.
For the group_by use terms on the common key.
For the aggreg…
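A minimal sketch of such a join, again as a request body built in Python. The excerpt is cut off before it describes the aggregations, so the aggregation part below is an assumption and not the quoted author's method: it uses a `filter` aggregation on the `_index` metadata field to sum a value separately for each source index. The field name `amount` and the destination index name are hypothetical; the index names and the `customer_id` key are taken from the first excerpt.

```python
def per_index_sum(index_name, value_field):
    """Sum value_field only over documents that come from index_name.

    Assumption: documents are told apart by filtering on the _index
    metadata field; the original post may have done this differently.
    """
    return {
        "filter": {"term": {"_index": index_name}},
        "aggs": {"sum": {"sum": {"field": value_field}}},
    }


def build_join_transform(indices, key, value_field, dest_index):
    """Build a pivot transform body that groups several indices by a common key."""
    return {
        "source": {"index": list(indices)},
        "dest": {"index": dest_index},
        "pivot": {
            # terms group_by on the key that both indices share
            "group_by": {"id": {"terms": {"field": key}}},
            # one filtered sum per source index, named after that index
            "aggregations": {
                name: per_index_sum(name, value_field) for name in indices
            },
        },
    }


# "amount" and "purchase_comparison" are hypothetical names.
join_body = build_join_transform(
    indices=["past_purchases", "predicted_purchases"],
    key="customer_id",
    value_field="amount",
    dest_index="purchase_comparison",
)
```

With a body like this, each destination document would hold one `customer_id` together with one sum per source index, which is what makes the two indices comparable side by side.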
Using transform to join multiple indices is not straightforward, but these posts will help you get started with it.