Elasticsearch analyzer

Hi!
Is there a way to have one field use more than one analyzer?

Here is an example: I have an index called index 1 and a type called product.

Under product there are several fields (product_name, brand_name, etc.).

There are three analyzers: a Chinese analyzer, an English analyzer and a stop-word analyzer.

Say I search on product_name. Can the search results take all three of the above analyzers into account?

Hi @Greentea,

you can use multi-fields for that. Here is an example with multiple analyzers, written in Elasticsearch 5 syntax. I specified the standard analyzer explicitly just for demonstration; it is the implicit default:

PUT my_index
{
   "mappings": {
      "product": {
         "properties": {
            "product_name": {
               "type": "text",
               "analyzer": "standard",
               "fields": {
                  "stop": {
                     "type": "text",
                     "analyzer": "stop"
                  },
                  "en": {
                     "type": "text",
                     "analyzer": "english"
                  }
               }
            }
         }
      }
   }
}

You can then refer to the field as product_name and the subfields as product_name.stop and product_name.en. I also recommend the section Getting Started with Languages in the Definitive Guide.
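
For instance, a match query can target one sub-field directly. This is only a sketch: the search text "running shoes" is made up, and the body below would be sent to GET my_index/product/_search:

```json
{
  "query": {
    "match": {
      "product_name.en": "running shoes"
    }
  }
}
```

Because the query string is by default analyzed with the same analyzer as the field it targets (english here), stemmed forms should match as well.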

Daniel

Thank you @danielmitterdorfer for your suggestion, but I am already using multi-fields.
Let me explain my setup first.
My data contains a mix of Chinese and English names, and I have to support search in Simplified Chinese, Traditional Chinese and English. So I decided to convert Simplified Chinese to Traditional Chinese first (with a custom analyzer called "stcn") and then search.
I also need to support autocomplete, for which I followed the autocomplete example from the guide (the trigram and reverse analyzers).
Recently I needed to add stop-word and synonym support in Elasticsearch, so I created two new analyzers called stop and syno.

This is where the problem appears: search does not seem to support multiple analyzers (or I don't know how to write the mapping).

Can you explain more about using multi-fields?

Here are my index settings and mapping:

{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "trigram": {
            "type": "custom",
            "tokenizer": "standard",
            "filter": [
              "standard",
              "shingle",
              "uppercase"
            ]
          },
          "reverse": {
            "type": "custom",
            "tokenizer": "standard",
            "filter": [
              "standard",
              "reverse"
            ]
          },
          "cn": {
            "type": "custom",
            "tokenizer": "icu"
          },
          "stcn": {
            "type": "custom",
            "tokenizer": "stconvert"
          },
          "stop": {
            "type": "custom",
            "tokenizer": "icu",
            "filter": [
              "my_stop_en",
              "my_stop_cn"
            ]
          },
          "syno": {
            "type": "custom",
            "tokenizer": "standard",
            "filter": [
              "synonym"
            ]
          }
        },
        "tokenizer": {
          "icu": {
            "type": "icu_tokenizer"
          },
          "stconvert": {
            "type": "stconvert",
            "delimiter": "/",
            "keep_both": false,
            "convert_type": "s2t"
          }
        },
        "filter": {
          "shingle": {
            "type": "shingle",
            "min_shingle_size": 2,
            "max_shingle_size": 3
          },
          "synonym": {
            "type": "synonym",
            "synonyms_path": "syno/synonym.txt"
          },
          "my_stop_en": {
            "type": "stop",
            "stopwords_path": "stopword/english.txt"
          },
          "my_stop_cn": {
            "type": "stop",
            "stopwords_path": "stopword/chinese.txt"
          }
        }
      }
    }
  },
  "mappings": {
    "product": {
      "properties": {
        "display_name": {
          "type": "text",
          "include_in_all": true,
          "fields": {
            "stcn": {
              "type": "text",
              "analyzer": "stcn"
            },
            "cn": {
              "type": "text",
              "analyzer": "cn"
            },
            "trigram": {
              "type": "text",
              "analyzer": "trigram"
            },
            "reverse": {
              "type": "text",
              "analyzer": "reverse"
            },
            "stop": {
              "type": "text",
              "analyzer": "stop"
            },
            "syno": {
              "type": "text",
              "analyzer": "syno"
            }
          }
        }
      }
    }
  }
}

Hi @Greentea,

your mapping looks fine. The section One Language per Field should basically answer your questions.
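
To check what one of the custom analyzers actually emits, the _analyze API can help. A minimal sketch, assuming the index is named my_index and using a made-up Simplified Chinese sample text; the body would be sent to GET my_index/_analyze:

```json
{
  "analyzer": "stcn",
  "text": "计算机"
}
```

Swapping in "cn", "stop" or "syno" for the analyzer name lets you compare the tokens each sub-field would index.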

If you want to use multi-fields in searches, you can use a multi-match query, but you'll need to experiment to find which settings work best for your use case. Here is a simple example based on your mapping:

GET /my_index/product/_search
{
   "query": {
      "multi_match": {
         "query": "Tom and Jerry",
         "fields": [
            "display_name",
            "display_name.*"
         ],
         "type": "most_fields"
      }
   }
}

display_name.* refers to all your sub-fields. If you want to include only specific ones, you can spell out their names, e.g. display_name.reverse.
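
As a sketch of spelling out sub-fields, the request body below (sent to GET /my_index/product/_search) lists three of them and boosts one with the ^ syntax. The choice of sub-fields and the boost value of 2 are purely illustrative assumptions, not tuned recommendations:

```json
{
  "query": {
    "multi_match": {
      "query": "Tom and Jerry",
      "fields": [
        "display_name",
        "display_name.stcn^2",
        "display_name.syno"
      ],
      "type": "most_fields"
    }
  }
}
```

With most_fields, the scores of all matching fields are combined, so a document that matches in several sub-fields tends to rank higher.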

A minor and unrelated suggestion: You have so many customizations in your mappings that I don't think you need the _all field, so check whether you can disable it to save a bit of disk space (see docs).
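
In Elasticsearch 5 syntax, disabling _all could look roughly like the sketch below, sent as the body of PUT my_index when the index is created. It is trimmed to a single field for brevity, so the settings and sub-fields from your own mapping would still need to be merged in, and I left out include_in_all on the assumption that it serves no purpose once _all is disabled:

```json
{
  "mappings": {
    "product": {
      "_all": {
        "enabled": false
      },
      "properties": {
        "display_name": {
          "type": "text"
        }
      }
    }
  }
}
```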

Daniel
