Analyzer plugin needs access to multiple fields

andrewwwooster · November 25, 2015, 2:09pm

I have an Analyzer plugin that uses a TokenFilter to annotate tokens. My fields therefore have a) the original text and b) the corresponding annotation map. I am finding it very difficult to bring these two things together in an Analyzer.

Here's what I have tried:

Two fields:
I store the text in one field (e.g. text) and the annotation map in another field (e.g. annot).
THE PROBLEM: An Analyzer only sees the fieldName and the field's token stream. So when analyzing the text field, I have not found a way to access the contents of the annot field. Is there a way to access other fields in an Analyzer?

Single field with delimiter:
I store the text and annotation map in a single field, separated by a delimiter. My token filter uses the synonym map during analysis of the text and ensures that the annotation map after the delimiter is filtered out.
THE PROBLEM: While the appended annotation map is not indexed as terms, it still appears in the _source, so it can pollute results from the query highlighter.

Any other ideas?

Ivan · November 25, 2015, 4:19pm

Have you looked into using Lucene payloads? Elasticsearch has a payload filter as well.

Ivan
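
To make the payload suggestion concrete, here is a minimal plain-Java sketch of the idea behind a delimited payload filter: each token carries its annotation inline (e.g. `fox|NOUN`), the part before the delimiter becomes the indexed term, and the part after it is kept as payload bytes. This is an illustration only and uses no Lucene or Elasticsearch classes. The class name `DelimitedAnnotationSketch`, the `|` delimiter, the whitespace tokenization, and the sample annotations are all assumptions, not anything from the original plugin.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration only: no Lucene classes are used here.
// All names below are hypothetical.
class DelimitedAnnotationSketch {

    // One token plus its optional annotation, kept as raw bytes the way a payload would be.
    static final class AnnotatedToken {
        final String term;
        final byte[] payload; // null when the token carried no annotation

        AnnotatedToken(String term, byte[] payload) {
            this.term = term;
            this.payload = payload;
        }
    }

    // Splits "term|annotation" at the first delimiter; the term is what would be indexed.
    static AnnotatedToken split(String rawToken, char delimiter) {
        int idx = rawToken.indexOf(delimiter);
        if (idx < 0) {
            return new AnnotatedToken(rawToken, null);
        }
        String term = rawToken.substring(0, idx);
        byte[] payload = rawToken.substring(idx + 1).getBytes(StandardCharsets.UTF_8);
        return new AnnotatedToken(term, payload);
    }

    // Whitespace splitting stands in for a real tokenizer (an assumption of this sketch).
    static List<AnnotatedToken> analyze(String fieldValue, char delimiter) {
        List<AnnotatedToken> tokens = new ArrayList<>();
        for (String raw : fieldValue.trim().split("\\s+")) {
            if (!raw.isEmpty()) {
                tokens.add(split(raw, delimiter));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (AnnotatedToken t : analyze("quick|ADJ fox|NOUN jumps", '|')) {
            String annotation = t.payload == null
                    ? "(none)"
                    : new String(t.payload, StandardCharsets.UTF_8);
            System.out.println(t.term + " -> " + annotation);
        }
    }
}
```

With this layout the annotation travels with each token in the same field, so the analyzer never needs to read a second field. The inline annotations would still be part of the field value as submitted, so this addresses how annotations reach the index rather than what appears in _source.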
