Synonym token recovery


(stratawing) #1

I asked this a few days ago and got zero responses -- so I'll ask it
again. Note that a response that says "you're a total idiot -- this
isn't possible" is fine -- just need to get a sense of whether this is
possible or not. So here's the question:

Is there any way to return the actual tokens generated by an
index_analyzer at search time? I am using the synonym filter
(without expansion) to index a particular field, and would like to
get the standardized value (the right-hand side of the "=>" in my
synonyms file) for each document that was retrieved.

My synonyms file includes, for example:

i-pod, i pod => ipod,

When a user searches for "i-pod" or "i pod", I want to return the
indexed doc, but I also want to return "ipod" as the token that was
created at index time as part of the analysis.

I have my synonyms analyzer up and running, and can get the proper
tokens using the Analyze API.
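
To make the goal concrete, here is a rough Python sketch of what "the standardized value" means for the rule above. It is only an illustration that treats the whole field value as one string; it is not how the synonym filter works internally (the real filter operates on a token stream), and the names parse_rule and standardize are made up for this example.

```python
def parse_rule(line):
    """Parse one explicit mapping such as 'i-pod, i pod => ipod'.

    Returns a dict mapping each left-hand variant to the right-hand value.
    """
    left, right = line.split("=>")
    # Tolerate a stray trailing comma after the right-hand side.
    target = right.strip().rstrip(",").strip()
    variants = [v.strip().lower() for v in left.split(",") if v.strip()]
    return {variant: target for variant in variants}


def standardize(text, mapping):
    """Return the standardized value for text, or the normalized text itself."""
    # Lowercase and collapse runs of whitespace before the lookup.
    key = " ".join(text.lower().split())
    return mapping.get(key, key)


# Hypothetical usage with the rule from the synonyms file.
MAPPING = parse_rule("i-pod, i pod => ipod,")
```

With this mapping, standardize("i-pod", MAPPING) and standardize("i pod", MAPPING) both give "ipod", which is the value I would like returned alongside each hit.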

Many thanks in advance for your thoughts and time.

Cheers!


(Shay Banon) #2

I answered this on the other question just now :), but I'll mention it here as well: you can use the analyze API to get the generated tokens.
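
A minimal Python sketch of that call, under some assumptions: the host, index name, and analyzer name passed in are placeholders, and the JSON request body shown here is the form used by more recent releases (older releases take the analyzer as a query-string parameter and the raw text as the body). The response is expected to carry a "tokens" list whose entries each have a "token" field.

```python
import json
import urllib.request


def build_analyze_request(base_url, index, analyzer, text):
    """Build a POST request for the index-level _analyze endpoint."""
    url = f"{base_url.rstrip('/')}/{index}/_analyze"
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_tokens(response):
    """Pull the token strings out of a parsed _analyze response."""
    return [entry["token"] for entry in response.get("tokens", [])]


def analyze(base_url, index, analyzer, text):
    """Send the request and return the tokens the analyzer produced."""
    request = build_analyze_request(base_url, index, analyzer, text)
    with urllib.request.urlopen(request) as resp:
        return extract_tokens(json.load(resp))
```

For example, analyze("http://localhost:9200", "products", "my_synonym_analyzer", "i pod") would return the tokens for that text, assuming an index and analyzer with those (hypothetical) names exist. To get the standardized value per hit, you would run the stored field value of each returned document through the same call.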

On Wednesday, March 14, 2012 at 2:05 AM, stratawing wrote:


