I have two indices with matching fields that I want to use for an ESQL Lookup Join, however, the fields in each index have different names. Am I still able to use a lookup join?
As an example, I have a ‘customers’ index with a document per customer, and each customer doc has an ‘id’. I also have a ‘purchase-history’ index that includes a ‘customerId’ field. I’d like to use a lookup join here but the fields have different names. I’ve tried:
FROM purchase-history
| LOOKUP JOIN customers ON customerId == id
I’m not sure I’m fully understanding. In the example I gave, would the esql look like:
FROM purchase-history
| RENAME customerId AS id
| LOOKUP JOIN customers ON id
| RENAME id AS customerId
But then we’d lose the initial id field from ‘purchase-history’…
Ideally we would amend the ‘customers’ index to either evaluate a copied customerId field or rename its id to customerId, I don’t see how we can do this with the customers index being the right lookup index. Does esql let us amend the right lookup index before we join it somehow?
Thanks @RainTown , looks like that solution would work but I not a fan of moving the docs id to a temp field during the lookup… I can amend the original index mappings though so that both indices have a customerId (the customer’s customerId just being a copy of its id) and that should simplify this.
@leandrojmp yup both indices are lookup indices. I get the following error when I try to use that esql: [esql] > Unexpected error from Elasticsearch: verification_exception - Found ambiguous reference to [id]; matches any of [line 1:1 [id], line 2:15 [id]]
I think the syntax in general works, but it breaks if it uses a common field that is present in both indices… Feels a bit like a bug to me but I may be using it incorrectly.
Yeah, tested as below (tested on 9.3.0), seems to work fine.
Indeed, also a solution. Often it is not practicable to change mappings on existing data.
Either a bug or a feature! See below. It seems the LOOKUP JOIN <index> ON <x> == <y> makes no assumption about which of the 2 indices <x> is in, and <y> is in.
DELETE /customers-tmp
PUT /customers-tmp
{
"settings": {
"index.mode": "lookup",
"number_of_replicas": 0
},
"mappings": {
"properties": {
"id": {"type": "long"},
"name": {"type": "keyword"}
}
}
}
DELETE /purchase-history-tmp
PUT /purchase-history-tmp
{
"settings": {
"index.mode": "standard",
"number_of_replicas": 0
},
"mappings": {
"properties": {
"id": {"type": "keyword"},
"customerid": {"type": "long"},
"price" :{"type": "long"}
}
}
}
PUT /customers-tmp/_doc/1
{ "id": "1","name": "John"}
PUT /customers-tmp/_doc/2
{ "id": "2","name": "James"}
PUT /customers-tmp/_doc/3
{ "id": "3","name": "Peter"}
PUT /customers-tmp/_doc/4
{ "id": "4","name": "Paul"}
PUT /customers-tmp/_doc/5
{ "id": "5","name": "Tom"}
PUT /customers-tmp/_doc/6
{ "id": "6","name": "Dave"}
PUT /purchase-history-tmp/_doc/1
{"customerId": 1,"id": "order#1","price": "20"}
PUT /purchase-history-tmp/_doc/2
{"customerId": 2,"id": "order#2","price": "25"}
PUT /purchase-history-tmp/_doc/3
{"customerId": 4,"id": "order#3","price": "5"}
PUT /purchase-history-tmp/_doc/4
{"customerId": 4,"id": "order#4","price": "7"}
PUT /purchase-history-tmp/_doc/5
{"customerId": 5,"id": "order#5","price": "11"}
PUT /purchase-history-tmp/_doc/6
{"customerId": 5,"id": "order#6","price": "30"}
PUT /purchase-history-tmp/_doc/7
{"customerId": 1,"id": "order#7","price": "9"}
PUT /purchase-history-tmp/_doc/8
{"customerId": 2,"id": "order#8","price": "4"}
PUT /purchase-history-tmp/_doc/9
{"customerId": 3,"id": "order#9","price": "12"}
This works as expected:
FROM purchase-history-tmp
| RENAME id AS ph_id, customerId AS id
| LOOKUP JOIN customers-tmp ON id
| RENAME id AS customerId, ph_id AS id | SORT customerId ASC
Note this also works:
FROM purchase-history-tmp
| RENAME id AS orderId
| LOOKUP JOIN customers-tmp ON id == customerId
and also this:
FROM purchase-history-tmp
| RENAME id AS orderId
| LOOKUP JOIN customers-tmp ON customerId == id
Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant
logo are trademarks of the
Apache Software Foundation
in the United States and/or other countries.