I need help modeling and solving this problem with Elasticsearch and its NLP features.
We receive a query and have to extract the <product> and the <feature> from it.
The products are like:
1 JNM Gold Plan
2 JNM Gold & More
3 JNM Platinum
4 KMM Platinum
5 Express Plan
6 Mac and John’s Gold Plan
7 Mac and John’s Platinum
….
...
24....
The features are like:
ROI / Rate of Interest
Accelerated Rewards
USP / Milestone Rewards
Late Payment charges
Reward Points
Example queries:
“Give me reward points for JNM gold plan”
“Tell me about express plan for Mac and John’s gold plan”
Or with typos: “points rawerds KM platanum”
The task: given, say, the first query, “Give me reward points for JNM gold plan”, we have to figure out which <product> and which <feature> it maps to. This query maps to <product:JNM Gold plan> and <feature:Reward Points>, and that is exactly what we need to extract.
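For concreteness, this is roughly how I picture the input and the expected output. It is only a sketch: the names PRODUCTS, FEATURES and Extraction are made up for illustration, and the lists contain only the entries shown above, not the full catalogue.

```python
from dataclasses import dataclass

# Only the entries listed in this post; the real catalogue has more products.
PRODUCTS = [
    "JNM Gold Plan",
    "JNM Gold & More",
    "JNM Platinum",
    "KMM Platinum",
    "Express Plan",
    "Mac and John’s Gold Plan",
    "Mac and John’s Platinum",
]

FEATURES = [
    "ROI / Rate of Interest",
    "Accelerated Rewards",
    "USP / Milestone Rewards",
    "Late Payment charges",
    "Reward Points",
]


@dataclass(frozen=True)
class Extraction:
    """Hypothetical container for the two phrases pulled out of a query."""
    product: str
    feature: str


# The one mapping stated explicitly in this post.
query = "Give me reward points for JNM gold plan"
expected = Extraction(product="JNM Gold Plan", feature="Reward Points")

print(query, "->", expected)
```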
Complexity
The complexity comes from the fact that the same request can be phrased in more than one way, e.g.:
C1: Different sentence form: the words appear in a different order, e.g. “Tell JNM gold what is the rewards point”
C2: Typos and acronyms: “Give me rwd pts for jnm gld plan”
C3: Overlapping features: “Give me rewards point for JNM Gold Plan” should detect the feature as <feature:Reward Points> and not as another feature that shares a word with it
I come from an image processing background and this is a new area for me. To me it looked similar to “feature extraction” in image processing, but so far I have attempted the following and failed:
Phonetic matching: it failed badly, as it does not cope well with C2 above.
Fuzzy query: it was good at the term level, e.g. if we give “gld” it matches “gold”. But if the entire phrase “give me reward points for jnm gld plan” is given, it does not match “JNM Gold Plan”, because of edit distance issues (I guess).
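To make the edit distance guess concrete, here is a small sketch in plain Python. It is not Elasticsearch code, and `levenshtein` is my own helper written for illustration. It compares the distance for a single term with the distance for the whole query against the product name. As far as I understand, Elasticsearch's fuzziness allows at most 2 edits, so anything above that cannot match.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or keep if equal)
            ))
        prev = curr
    return prev[-1]


MAX_FUZZY_EDITS = 2  # assumption: the largest fuzziness Elasticsearch accepts

# Term level: one typo'd token against one token of the product name.
term_distance = levenshtein("gld", "gold")

# Phrase level: the whole query against the whole product name.
phrase_distance = levenshtein(
    "give me reward points for jnm gld plan",
    "jnm gold plan",
)

print("term:", term_distance, "within limit:", term_distance <= MAX_FUZZY_EDITS)
print("phrase:", phrase_distance, "within limit:", phrase_distance <= MAX_FUZZY_EDITS)
```

The extra words in the query alone push the phrase-level distance far past the limit, which would explain why term-level fuzziness works and phrase-level does not.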
Since my tricks are not working, I need some direction on how to approach such problems and how to do the mapping.
Would POS tagging help in such cases? Can you suggest what approach should be taken to extract these two categories of phrases from a search query?