Hello all!
We use machine learning to score products from an index in our custom Elasticsearch search plugin. For scoring, I need to fetch additional data from an external DB using the ids of the products found by textual matching in the plugin.
We use the product id as the Elasticsearch _id of a document, so we can fetch a product with a GET query.
Performance question: what is the most efficient way to fetch the product id inside the Weight?
Assume we have 20000 matched products.
- _id is a stored field, so I can extract the product id with an org.elasticsearch.index.fieldvisitor.FieldsVisitor object, the same way it happens in org.elasticsearch.search.fetch.FetchPhase:
    // false: do not load _source
    FieldsVisitor fv = new FieldsVisitor(false);
    fv.reset();
    fv.postProcess(queryContext.getQueryShardContext().getMapperService());
    reader.document(doc, fv);
    productId = fv.uid().id();
  This takes more than 1 second for 20000 products.
- We hold the product id in a separate doc values field and fetch it in the usual way.
  This takes about 400 ms for 20000 products.
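To put the two measurements on a per-document footing, here is a small arithmetic sketch in Java. The class and method names (PerDocCost, microsPerDoc) are hypothetical and only for illustration. The figures are the ones quoted above, with "more than 1 second" treated as a lower bound of 1000 ms.

```java
public class PerDocCost {
    // Average cost per document, in microseconds, for a batch of matched docs.
    public static double microsPerDoc(double totalMillis, int docCount) {
        return totalMillis * 1000.0 / docCount;
    }

    public static void main(String[] args) {
        int matched = 20_000;           // matched products, as stated in the post
        double storedFieldMs = 1000.0;  // assumption: lower bound for "more than 1 second"
        double docValuesMs = 400.0;     // "about 400 ms"

        System.out.printf("stored _id: >= %.0f us/doc%n", microsPerDoc(storedFieldMs, matched));
        System.out.printf("doc values: ~ %.0f us/doc%n", microsPerDoc(docValuesMs, matched));
    }
}
```

So the stored-field lookup costs at least 50 microseconds per matched document and the doc values lookup about 20, assuming the cost is spread evenly over the matches.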
The second approach is faster, but it is still too slow. Could you advise me on the fastest way?
Regards,
Vadim Gindin