Hi,
A transform can read from multiple source indices, and they do not need a common schema as long as the field used in the group_by has the same name and a compatible type in each of them. For example, if both "past_purchases" and "predicted_purchases" contain a field "customer_id", you can create a transform that groups the documents of both indices by that field, which effectively joins them on "customer_id":
PUT _transform/prediction_accuracy
{
  "source": {
    "index": ["past_purchases", "predicted_purchases"]
  },
  "pivot": {
    "group_by": {
      "id": {
        "terms": {
          "field": "customer_id"
        }
      }
    },
    "aggregations": {
      ...
    }
  }
}
Note that aggregations are robust to missing fields: if a field exists in "past_purchases" but not in "predicted_purchases", you can still aggregate on it, e.g. to calculate an average. Documents that lack the field are ignored, so the average is computed over the correct document count.
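To make this more concrete, here is a sketch of what a complete request body for PUT _transform/prediction_accuracy could look like. The field names "amount" (assumed to exist only in "past_purchases") and "predicted_amount" (assumed to exist only in "predicted_purchases") are made up for illustration, as is the destination index name; a real request also needs a "dest" section, which the snippet above leaves out. Please adapt it to your own mappings.

```json
{
  "source": {
    "index": ["past_purchases", "predicted_purchases"]
  },
  "dest": {
    "index": "prediction_accuracy"
  },
  "pivot": {
    "group_by": {
      "id": {
        "terms": {
          "field": "customer_id"
        }
      }
    },
    "aggregations": {
      "avg_actual": {
        "avg": {
          "field": "amount"
        }
      },
      "avg_predicted": {
        "avg": {
          "field": "predicted_amount"
        }
      },
      "error": {
        "bucket_script": {
          "buckets_path": {
            "actual": "avg_actual",
            "predicted": "avg_predicted"
          },
          "script": "params.predicted - params.actual"
        }
      }
    }
  }
}
```

Each average only counts the documents that actually contain its field, so "avg_actual" is effectively computed from "past_purchases" and "avg_predicted" from "predicted_purchases", even though both run over the combined source. For a customer that appears in only one of the two indices, one of the averages has no value, and I would expect "error" to be missing from that output document.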
I hope this helps!