Hi,
I have the following mapping:
"mappings": {
  "locations": {
    "properties": {
      "addresses": {
        "type": "nested",
        "properties": {
          "id": { "type": "string" },
          "type": { "type": "string" }
        }
      }
    }
  }
}
I already have a DataFrame (df_aids) containing address ids. For each of them, I'd like to get the ids of the documents whose nested "addresses" field contains an entry with that "id".
My current solution is:
df.select(explode(df("addresses.id")).as("aid"), df("id"))
  .join(df_aids, $"aid" === df_aids("id"))
  .select(df("id"), df_aids("id"))
I'm concerned about performance. Is this the best way to find the documents in df whose "addresses.id" values contain ids from df_aids?
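One variant I have been considering, in case it helps frame the question: if df_aids is small enough to fit in memory on each executor (an assumption, I have not measured it), the join could carry a broadcast hint so that the exploded documents are not shuffled. Below is a minimal, self-contained sketch of that idea. It assumes Spark 2.x or later (SparkSession); the Address case class, the sample rows, and the names AidLookupSketch and findDocIds are hypothetical stand-ins for my real data and code, not part of it.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{broadcast, col, explode}

// Hypothetical shape of one nested address entry, mirroring the mapping above.
case class Address(id: String, `type`: String)

object AidLookupSketch {
  // Returns (doc_id, aid) pairs for every address id of `docs` that also appears in `aids`.
  // Assumes `docs` has columns "id" and "addresses" (array of structs with an "id" field)
  // and `aids` has a single column "id".
  def findDocIds(docs: DataFrame, aids: DataFrame): DataFrame = {
    // One row per (document id, address id).
    val exploded = docs.select(
      col("id").as("doc_id"),
      explode(col("addresses.id")).as("aid")
    )
    // broadcast() is only a hint: it asks Spark to ship the small side to every executor.
    exploded
      .join(broadcast(aids), exploded("aid") === aids("id"))
      .select(col("doc_id"), col("aid"))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("aid-lookup-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data standing in for the documents read from Elasticsearch.
    val df = Seq(
      ("doc1", Seq(Address("a1", "home"), Address("a2", "work"))),
      ("doc2", Seq(Address("a3", "home")))
    ).toDF("id", "addresses")
    val df_aids = Seq("a2", "a3").toDF("id")

    findDocIds(df, df_aids).show()
    spark.stop()
  }
}
```

The logic is the same explode-then-join as my current solution; the only differences are the broadcast hint and renaming the document id to doc_id so the two "id" columns cannot be confused. Whether this is actually faster depends on the size of df_aids, which is part of what I am asking.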
Thanks