Tokenize string based on available field values

We process all our data in PHP, and PHP is extremely slow at processing large(r) arrays, so this question is about whether Elasticsearch can do the following internally:

  1. We have an index of documents in which one of the fields is a list of keywords or keyword phrases associated with that document. Example:
    classify Doc title: "women->beauty"
    classify Doc keywords: ["lipliner", "lip liner", "hair dryer", "mascara", "nail", "eyelash", "curling iron"]

  2. Every time we attempt to classify a product internally, we run an _analyze query to find all keyword associations that match the product title, e.g. title: "Philips curling iron for thick hair"

  3. Currently we run a match query against classify Doc keywords to see if anything matches, but it results in a lot of noise: a multi-word title like "Panasonic iron for silk clothing" will "match" classify Doc keywords on the single word "iron", even though classify Doc keywords only defines it as part of the phrase "curling iron".
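
To make the noise concrete, here is a minimal sketch in plain Python (not Elasticsearch) that contrasts word-level matching, which is roughly what the match query gives us, with the phrase-level matching we actually want. The helper names `tokens`, `word_matches` and `phrase_matches` are made up for this illustration, and the tokenization is deliberately simplistic.

```python
import re

KEYWORDS = ["lipliner", "lip liner", "hair dryer", "mascara",
            "nail", "eyelash", "curling iron"]


def tokens(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def word_matches(title, keywords):
    """Keywords that share at least one word with the title (the noisy behaviour)."""
    title_words = set(tokens(title))
    return [k for k in keywords if title_words & set(tokens(k))]


def phrase_matches(title, keywords):
    """Keywords whose words occur in the title as one contiguous sequence."""
    t = tokens(title)
    hits = []
    for k in keywords:
        kt = tokens(k)
        n = len(kt)
        if n > 0 and any(t[i:i + n] == kt for i in range(len(t) - n + 1)):
            hits.append(k)
    return hits


if __name__ == "__main__":
    for title in ("Philips curling iron for thick hair",
                  "Panasonic iron for silk clothing"):
        print(title)
        print("  word-level:  ", word_matches(title, KEYWORDS))
        print("  phrase-level:", phrase_matches(title, KEYWORDS))
```

With word-level matching the Panasonic title picks up "curling iron" and the Philips title also picks up "hair dryer"; with phrase-level matching only the Philips title matches, and only on "curling iron".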


  1. Is it possible to have Elasticsearch tokenize the search string based on the list of defined keywords?
  2. Is it possible to add stemming to #1 to allow "wider" lookups such as "Best irons for clothes" or "Cheapest curling irons from panasonic"?
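
The kind of setup I have in mind might look roughly like the sketch below: request bodies written as Python dicts, combining Elasticsearch's `shingle` and `stemmer` token filters. This is a guess at a possible approach, not something I have working. The field name `keywords`, the analyzer and filter names, and the helper `build_match_query` are all hypothetical, and the sketch assumes every keyword is at most two words long.

```python
# Hypothetical names throughout; assumes keywords are one or two words long.
ANALYSIS = {
    "filter": {
        "english_stemmer": {"type": "stemmer", "language": "english"},
        # Index side: a two-word keyword becomes a single bigram token;
        # single words are emitted only when no bigram can be formed.
        "keyword_shingles": {
            "type": "shingle",
            "min_shingle_size": 2,
            "max_shingle_size": 2,
            "output_unigrams": False,
            "output_unigrams_if_no_shingles": True,
        },
        # Search side: emit every single word plus every adjacent word pair.
        "title_shingles": {
            "type": "shingle",
            "min_shingle_size": 2,
            "max_shingle_size": 2,
            "output_unigrams": True,
        },
    },
    "analyzer": {
        "keyword_index": {
            "type": "custom",
            "tokenizer": "standard",
            "filter": ["lowercase", "english_stemmer", "keyword_shingles"],
        },
        "title_search": {
            "type": "custom",
            "tokenizer": "standard",
            "filter": ["lowercase", "english_stemmer", "title_shingles"],
        },
    },
}

# Body for creating the index: keywords are indexed with one analyzer
# and searched with the other.
INDEX_BODY = {
    "settings": {"analysis": ANALYSIS},
    "mappings": {
        "properties": {
            "keywords": {
                "type": "text",
                "analyzer": "keyword_index",
                "search_analyzer": "title_search",
            }
        }
    },
}


def build_match_query(title):
    """Build a match query body that searches the keywords field with a product title."""
    return {"query": {"match": {"keywords": {"query": title}}}}
```

The intent is that "curling iron" is indexed only as one stemmed two-word token, so a title containing just "irons" should not hit it, while "Cheapest curling irons from panasonic" should. Running the _analyze API with each analyzer would be the way to check which tokens are really produced.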

I figure some sort of analyzer could be used for that, or maybe ?explain=true already does something similar?

Thank you.